pg_upgrade and logical replication

Started by Julien Rouhaud almost 3 years ago · 221 messages
#1 Julien Rouhaud
rjuju123@gmail.com

Hi,

I was working on testing a major upgrade scenario using a mix of physical and
logical replication when I faced some unexpected problem leading to missing
rows. Note that my motivation is to rely on physical replication / physical
backup to avoid recreating a node from scratch using logical replication, as
the initial sync with logical replication is much more costly and impacting
compared to pg_basebackup / restoring a physical backup, but the same problem
exists if you just pg_upgrade a node that has subscriptions.

The problem is that pg_upgrade creates the subscriptions on the newly upgraded
node using "WITH (connect = false)", which seems expected as you obviously
don't want to try to connect to the publisher at that point. But then once the
newly upgraded node is restarted and ready to replace the previous one, unless
I'm missing something there's absolutely no possibility to use the created
subscriptions without losing some data from the publisher.

The reason is that the subscription doesn't have a local list of relations to
process until you refresh the subscription, but you can't refresh the
subscription without enabling it (and you can't enable it in a transaction),
which means that you have to let the logical worker start, consume and ignore
all changes that happened on the publisher side until the refresh happens.
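To make the sequence concrete, here is roughly what it looks like (subscription, publication and connection names are made up):

```sql
-- what pg_upgrade emits on the new cluster for each subscription:
CREATE SUBSCRIPTION mysub CONNECTION 'host=pub dbname=mydb'
    PUBLICATION mypub WITH (connect = false);

-- refreshing is rejected while the subscription is disabled:
ALTER SUBSCRIPTION mysub REFRESH PUBLICATION;
-- ERROR:  ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions

-- so the apply worker has to start (and start consuming changes)
-- before the refresh can populate pg_subscription_rel:
ALTER SUBSCRIPTION mysub ENABLE;
ALTER SUBSCRIPTION mysub REFRESH PUBLICATION WITH (copy_data = false);
```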

An easy workaround that I tried is to allow something like

ALTER SUBSCRIPTION ... ENABLE WITH (refresh = true, copy_data = false)

so that the refresh internally happens before the apply worker is started and
you just keep consuming the delta, which works in a naive scenario.

One concern I have with this approach is that the default value for both
"refresh" and "copy_data" in all other subcommands is "true", but we would
probably need a different default value in that exact scenario (as we know we
already have the data). I think that it would otherwise be safe in my very
specific scenario, assuming that you created the slot beforehand and moved the
slot's LSN at the promotion point, as even if you add non-empty tables to the
publication you will only need the delta whether those were initially empty or
not given your initial physical replica state. Any other scenario would make
this new option dangerous, if not entirely useless, but not more than any of
the current commands that lead to refreshing a subscription and have the same
options I guess.

All in all, currently the only way to somewhat safely resume logical
replication after a pg_upgrade is to drop all the subscriptions that were
transferred during pg_upgrade on all databases and recreate them (using the
existing slots on the publisher side obviously), allowing the initial
connection. But this approach only works in the exact scenario I mentioned
(physical to logical replication, or at least a case where *all* the tables
were logically replicated prior to the pg_upgrade), otherwise you have to
recreate the follower node from scratch using logical replication.
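For reference, the drop-and-recreate workaround described here looks roughly like this (names are hypothetical); detaching the slot first keeps it alive on the publisher:

```sql
-- on the upgraded subscriber: detach the slot so DROP doesn't remove it
ALTER SUBSCRIPTION mysub DISABLE;
ALTER SUBSCRIPTION mysub SET (slot_name = NONE);
DROP SUBSCRIPTION mysub;

-- recreate it, reusing the slot that still exists on the publisher
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=pub dbname=mydb' PUBLICATION mypub
    WITH (slot_name = 'mysub', create_slot = false, copy_data = false);
```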

Is that indeed the current behavior, or did I miss something?

Is this "resume logical replication on pg_upgraded node" something we want to
support better? I was thinking that we could add a new pg_dump mode (maybe
only usable during pg_upgrade) that also restores the pg_subscription_rel
content in each subscription or something like that. If not, should pg_upgrade
keep preserving the subscriptions as it doesn't seem safe to use them, or at
least document the hazards (I didn't find anything about it in the
documentation)?

#2 Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#1)
Re: pg_upgrade and logical replication

On Fri, Feb 17, 2023 at 1:24 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

I was working on testing a major upgrade scenario using a mix of physical and
logical replication when I faced some unexpected problem leading to missing
rows. Note that my motivation is to rely on physical replication / physical
backup to avoid recreating a node from scratch using logical replication, as
the initial sync with logical replication is much more costly and impacting
compared to pg_basebackup / restoring a physical backup, but the same problem
exists if you just pg_upgrade a node that has subscriptions.

The problem is that pg_upgrade creates the subscriptions on the newly upgraded
node using "WITH (connect = false)", which seems expected as you obviously
don't want to try to connect to the publisher at that point. But then once the
newly upgraded node is restarted and ready to replace the previous one, unless
I'm missing something there's absolutely no possibility to use the created
subscriptions without losing some data from the publisher.

The reason is that the subscription doesn't have a local list of relations to
process until you refresh the subscription, but you can't refresh the
subscription without enabling it (and you can't enable it in a transaction),
which means that you have to let the logical worker start, consume and ignore
all changes that happened on the publisher side until the refresh happens.

An easy workaround that I tried is to allow something like

ALTER SUBSCRIPTION ... ENABLE WITH (refresh = true, copy_data = false)

so that the refresh internally happens before the apply worker is started and
you just keep consuming the delta, which works in a naive scenario.

One concern I have with this approach is that the default value for both
"refresh" and "copy_data" in all other subcommands is "true", but we would
probably need a different default value in that exact scenario (as we know we
already have the data). I think that it would otherwise be safe in my very
specific scenario, assuming that you created the slot beforehand and moved the
slot's LSN at the promotion point, as even if you add non-empty tables to the
publication you will only need the delta whether those were initially empty or
not given your initial physical replica state.

This point is not very clear. Why would one just need delta even for new tables?

Any other scenario would make
this new option dangerous, if not entirely useless, but not more than any of
the current commands that lead to refreshing a subscription and have the same
options I guess.

All in all, currently the only way to somewhat safely resume logical
replication after a pg_upgrade is to drop all the subscriptions that were
transferred during pg_upgrade on all databases and recreate them (using the
existing slots on the publisher side obviously), allowing the initial
connection. But this approach only works in the exact scenario I mentioned
(physical to logical replication, or at least a case where *all* the tables
were logically replicated prior to the pg_upgrade), otherwise you have to
recreate the follower node from scratch using logical replication.

I think if you dropped and recreated the subscriptions by retaining
old slots, the replication should resume from where it left off before
the upgrade. Which scenario are you concerned about?

Is that indeed the current behavior, or did I miss something?

Is this "resume logical replication on pg_upgraded node" something we want to
support better? I was thinking that we could add a new pg_dump mode (maybe
only usable during pg_upgrade) that also restores the pg_subscription_rel
content in each subscription or something like that. If not, should pg_upgrade
keep preserving the subscriptions as it doesn't seem safe to use them, or at
least document the hazards (I didn't find anything about it in the
documentation)?

There is a mention of this in the pg_dump docs. See [1] (When dumping
logical replication subscriptions ...)

[1]: https://www.postgresql.org/docs/devel/app-pgdump.html

--
With Regards,
Amit Kapila.

#3 Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#2)
Re: pg_upgrade and logical replication

Hi,

On Fri, Feb 17, 2023 at 04:12:54PM +0530, Amit Kapila wrote:

On Fri, Feb 17, 2023 at 1:24 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

An easy workaround that I tried is to allow something like

ALTER SUBSCRIPTION ... ENABLE WITH (refresh = true, copy_data = false)

so that the refresh internally happens before the apply worker is started and
you just keep consuming the delta, which works in a naive scenario.

One concern I have with this approach is that the default value for both
"refresh" and "copy_data" in all other subcommands is "true", but we would
probably need a different default value in that exact scenario (as we know we
already have the data). I think that it would otherwise be safe in my very
specific scenario, assuming that you created the slot beforehand and moved the
slot's LSN at the promotion point, as even if you add non-empty tables to the
publication you will only need the delta whether those were initially empty or
not given your initial physical replica state.

This point is not very clear. Why would one just need delta even for new tables?

Because in my scenario I'm coming from physical replication, so I know that I
did replicate everything until the promotion LSN. Any table later added in the
publication is either already fully replicated until that LSN on the upgraded
node, so only the delta is needed, or has been created after that LSN. In the
latter case, the entirety of the table will be replicated with the logical
replication as a delta right?

Any other scenario would make
this new option dangerous, if not entirely useless, but not more than any of
the current commands that lead to refreshing a subscription and have the same
options I guess.

All in all, currently the only way to somewhat safely resume logical
replication after a pg_upgrade is to drop all the subscriptions that were
transferred during pg_upgrade on all databases and recreate them (using the
existing slots on the publisher side obviously), allowing the initial
connection. But this approach only works in the exact scenario I mentioned
(physical to logical replication, or at least a case where *all* the tables
were logically replicated prior to the pg_upgrade), otherwise you have to
recreate the follower node from scratch using logical replication.

I think if you dropped and recreated the subscriptions by retaining
old slots, the replication should resume from where it left off before
the upgrade. Which scenario are you concerned about?

I'm concerned about people not coming from physical replication. If you just
had some "normal" logical replication, you can't assume that you already have
all the data from the upstream subscription. If it was modified and a
non-empty table is added, you might need to copy the data for some of the
tables and keep replicating for the rest. It's hard to be sure from a user
point of view, and even if you knew, you would have no way to express it.

Is that indeed the current behavior, or did I miss something?

Is this "resume logical replication on pg_upgraded node" something we want to
support better? I was thinking that we could add a new pg_dump mode (maybe
only usable during pg_upgrade) that also restores the pg_subscription_rel
content in each subscription or something like that. If not, should pg_upgrade
keep preserving the subscriptions as it doesn't seem safe to use them, or at
least document the hazards (I didn't find anything about it in the
documentation)?

There is a mention of this in pg_dump docs. See [1] (When dumping
logical replication subscriptions ...)

Indeed, but it's barely saying "It is then up to the user to reactivate the
subscriptions in a suitable way" and "It might also be appropriate to truncate
the target tables before initiating a new full table copy". As I mentioned, I
don't think there's a suitable way to reactivate the subscription, at least if
you don't want to miss some records, so truncating all target tables is the
only fully safe way to proceed. It seems quite silly to have to do so just
because pg_upgrade doesn't retain the list of relations per subscription.

#4 Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#3)
Re: pg_upgrade and logical replication

On Fri, Feb 17, 2023 at 9:05 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Fri, Feb 17, 2023 at 04:12:54PM +0530, Amit Kapila wrote:

On Fri, Feb 17, 2023 at 1:24 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

An easy workaround that I tried is to allow something like

ALTER SUBSCRIPTION ... ENABLE WITH (refresh = true, copy_data = false)

so that the refresh internally happens before the apply worker is started and
you just keep consuming the delta, which works in a naive scenario.

One concern I have with this approach is that the default value for both
"refresh" and "copy_data" in all other subcommands is "true", but we would
probably need a different default value in that exact scenario (as we know we
already have the data). I think that it would otherwise be safe in my very
specific scenario, assuming that you created the slot beforehand and moved the
slot's LSN at the promotion point, as even if you add non-empty tables to the
publication you will only need the delta whether those were initially empty or
not given your initial physical replica state.

This point is not very clear. Why would one just need delta even for new tables?

Because in my scenario I'm coming from physical replication, so I know that I
did replicate everything until the promotion LSN. Any table later added in the
publication is either already fully replicated until that LSN on the upgraded
node, so only the delta is needed, or has been created after that LSN. In the
latter case, the entirety of the table will be replicated with the logical
replication as a delta right?

That makes sense to me.

Any other scenario would make
this new option dangerous, if not entirely useless, but not more than any of
the current commands that lead to refreshing a subscription and have the same
options I guess.

All in all, currently the only way to somewhat safely resume logical
replication after a pg_upgrade is to drop all the subscriptions that were
transferred during pg_upgrade on all databases and recreate them (using the
existing slots on the publisher side obviously), allowing the initial
connection. But this approach only works in the exact scenario I mentioned
(physical to logical replication, or at least a case where *all* the tables
were logically replicated prior to the pg_upgrade), otherwise you have to
recreate the follower node from scratch using logical replication.

I think if you dropped and recreated the subscriptions by retaining
old slots, the replication should resume from where it left off before
the upgrade. Which scenario are you concerned about?

I'm concerned about people not coming from physical replication. If you just
had some "normal" logical replication, you can't assume that you already have
all the data from the upstream subscription. If it was modified and a
non-empty table is added, you might need to copy the data for some of the
tables and keep replicating for the rest. It's hard to be sure from a user
point of view, and even if you knew, you would have no way to express it.

Can't the user create a separate publication for such newly added
tables and a corresponding new subscription on the downstream node?
Now, I think it would be a bit tricky if the user already has a
publication defined with FOR ALL TABLES. In that case, we probably
need some way to specify FOR ALL TABLES EXCEPT (list of tables) which
we currently don't have.
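A sketch of this suggestion, with made-up names (the EXCEPT syntax is hypothetical and does not exist today):

```sql
-- on the publisher: a dedicated publication for tables added later
CREATE PUBLICATION newtabs_pub FOR TABLE newtab1, newtab2;

-- on the subscriber: a matching subscription that does the initial copy
CREATE SUBSCRIPTION newtabs_sub
    CONNECTION 'host=pub dbname=mydb' PUBLICATION newtabs_pub;

-- hypothetical syntax, not implemented:
-- CREATE PUBLICATION mypub FOR ALL TABLES EXCEPT (newtab1, newtab2);
```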

Is that indeed the current behavior, or did I miss something?

Is this "resume logical replication on pg_upgraded node" something we want to
support better? I was thinking that we could add a new pg_dump mode (maybe
only usable during pg_upgrade) that also restores the pg_subscription_rel
content in each subscription or something like that. If not, should pg_upgrade
keep preserving the subscriptions as it doesn't seem safe to use them, or at
least document the hazards (I didn't find anything about it in the
documentation)?

There is a mention of this in pg_dump docs. See [1] (When dumping
logical replication subscriptions ...)

Indeed, but it's barely saying "It is then up to the user to reactivate the
subscriptions in a suitable way" and "It might also be appropriate to truncate
the target tables before initiating a new full table copy". As I mentioned, I
don't think there's a suitable way to reactivate the subscription, at least if
you don't want to miss some records, so truncating all target tables is the
only fully safe way to proceed. It seems quite silly to have to do so just
because pg_upgrade doesn't retain the list of relations per subscription.

I also don't know if there is any other safe way for newly added
tables apart from the above suggestion to create separate publications
but that can work only in specific cases.

--
With Regards,
Amit Kapila.

#5 Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#4)
Re: pg_upgrade and logical replication

On Sat, Feb 18, 2023 at 09:31:30AM +0530, Amit Kapila wrote:

On Fri, Feb 17, 2023 at 9:05 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

I'm concerned about people not coming from physical replication. If you just
had some "normal" logical replication, you can't assume that you already have
all the data from the upstream subscription. If it was modified and a
non-empty table is added, you might need to copy the data for some of the
tables and keep replicating for the rest. It's hard to be sure from a user
point of view, and even if you knew, you would have no way to express it.

Can't the user create a separate publication for such newly added
tables and a corresponding new subscription on the downstream node?

Yes that seems like a safe way to go, but it relies on users being very careful
if they don't want to get corrupted logical standby, and I think it's
impossible to run any check to make sure that the subscription is adequate?

Now, I think it would be a bit tricky if the user already has a
publication defined with FOR ALL TABLES. In that case, we probably
need some way to specify FOR ALL TABLES EXCEPT (list of tables) which
we currently don't have.

Yes, and note that I rely on FOR ALL TABLES for my original physical to logical
use case.

Indeed, but it's barely saying "It is then up to the user to reactivate the
subscriptions in a suitable way" and "It might also be appropriate to truncate
the target tables before initiating a new full table copy". As I mentioned, I
don't think there's a suitable way to reactivate the subscription, at least if
you don't want to miss some records, so truncating all target tables is the
only fully safe way to proceed. It seems quite silly to have to do so just
because pg_upgrade doesn't retain the list of relations per subscription.

I also don't know if there is any other safe way for newly added
tables apart from the above suggestion to create separate publications
but that can work only in specific cases.

I might be missing something, but what could go wrong if pg_upgrade could emit
a bunch of commands like:

ALTER SUBSCRIPTION subname ADD RELATION relid STATE 'x' LSN 'X/Y';

pg_upgrade already preserves the relation's oid, so we could restore the
exact original state and then enabling the subscription would just work?

We could restrict this form to --binary only so we don't provide a way for
users to mess the data.

#6 Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#5)
Re: pg_upgrade and logical replication

On Sat, Feb 18, 2023 at 11:21 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Sat, Feb 18, 2023 at 09:31:30AM +0530, Amit Kapila wrote:

On Fri, Feb 17, 2023 at 9:05 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

I'm concerned about people not coming from physical replication. If you just
had some "normal" logical replication, you can't assume that you already have
all the data from the upstream subscription. If it was modified and a
non-empty table is added, you might need to copy the data for some of the
tables and keep replicating for the rest. It's hard to be sure from a user
point of view, and even if you knew, you would have no way to express it.

Can't the user create a separate publication for such newly added
tables and a corresponding new subscription on the downstream node?

Yes that seems like a safe way to go, but it relies on users being very careful
if they don't want to get corrupted logical standby, and I think it's
impossible to run any check to make sure that the subscription is adequate?

I can't think of any straightforward way, but one can probably take a
dump of the data on both nodes using pg_dump and then compare them.

Now, I think it would be a bit tricky if the user already has a
publication defined with FOR ALL TABLES. In that case, we probably
need some way to specify FOR ALL TABLES EXCEPT (list of tables) which
we currently don't have.

Yes, and note that I rely on FOR ALL TABLES for my original physical to logical
use case.

Okay, but if we would have functionality like EXCEPT (list of tables),
one could do ALTER PUBLICATION .. before doing REFRESH on the
subscriber-side.

Indeed, but it's barely saying "It is then up to the user to reactivate the
subscriptions in a suitable way" and "It might also be appropriate to truncate
the target tables before initiating a new full table copy". As I mentioned, I
don't think there's a suitable way to reactivate the subscription, at least if
you don't want to miss some records, so truncating all target tables is the
only fully safe way to proceed. It seems quite silly to have to do so just
because pg_upgrade doesn't retain the list of relations per subscription.

I also don't know if there is any other safe way for newly added
tables apart from the above suggestion to create separate publications
but that can work only in specific cases.

I might be missing something, but what could go wrong if pg_upgrade could emit
a bunch of commands like:

ALTER SUBSCRIPTION subname ADD RELATION relid STATE 'x' LSN 'X/Y';

How will we know the STATE and LSN of each relation? But even if we
know that, what is the guarantee that the publisher side has still
retained the corresponding slots?

--
With Regards,
Amit Kapila.

#7 Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#6)
Re: pg_upgrade and logical replication

On Sat, Feb 18, 2023 at 04:12:52PM +0530, Amit Kapila wrote:

On Sat, Feb 18, 2023 at 11:21 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

Now, I think it would be a bit tricky if the user already has a
publication defined with FOR ALL TABLES. In that case, we probably
need some way to specify FOR ALL TABLES EXCEPT (list of tables) which
we currently don't have.

Yes, and note that I rely on FOR ALL TABLES for my original physical to logical
use case.

Okay, but if we would have functionality like EXCEPT (list of tables),
one could do ALTER PUBLICATION .. before doing REFRESH on the
subscriber-side.

Honestly I'm not a huge fan of this approach. It feels hacky to have such a
feature, and doesn't even solve the problem on its own as you still lose
records when reactivating the subscription unless you also provide an ALTER
SUBSCRIPTION ENABLE WITH (refresh = true, copy_data = false), which will
probably require different defaults than the rest of the ALTER SUBSCRIPTION
subcommands that handle a refresh.

Indeed, but it's barely saying "It is then up to the user to reactivate the
subscriptions in a suitable way" and "It might also be appropriate to truncate
the target tables before initiating a new full table copy". As I mentioned, I
don't think there's a suitable way to reactivate the subscription, at least if
you don't want to miss some records, so truncating all target tables is the
only fully safe way to proceed. It seems quite silly to have to do so just
because pg_upgrade doesn't retain the list of relations per subscription.

I also don't know if there is any other safe way for newly added
tables apart from the above suggestion to create separate publications
but that can work only in specific cases.

I might be missing something, but what could go wrong if pg_upgrade could emit
a bunch of commands like:

ALTER SUBSCRIPTION subname ADD RELATION relid STATE 'x' LSN 'X/Y';

How will we know the STATE and LSN of each relation?

In the pg_subscription_rel catalog of the upgraded server? I didn't look in
detail at how the information is updated, but I'm assuming that if logical
replication survives a database restart it shouldn't be a problem to also
fully dump it during pg_upgrade.
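For reference, the per-relation information in question can be inspected with something like:

```sql
-- one row per subscribed relation, with its sync state and LSN
SELECT s.subname,
       rel.srrelid::regclass AS relation,
       rel.srsubstate,        -- 'i', 'd', 's' or 'r' (ready)
       rel.srsublsn
FROM pg_subscription_rel rel
JOIN pg_subscription s ON s.oid = rel.srsubid;
```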

But even if we
know that, what is the guarantee that the publisher side has still
retained the corresponding slots?

No guarantee, but if you're just doing a pg_upgrade of a logical replica, why
would you drop the replication slot? In any case the warning you mentioned in
the pg_dump documentation would still apply and you would have to re-enable it
as needed; the only difference is that you would actually be able to keep your
logical replication after a pg_upgrade if you need to. If you dropped the
replication slot on the publisher side, then simply remove the subscriptions
on the upgraded node too, or create new ones, exactly as you would do with the
current pg_upgrade workflow.

#8 Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#7)
Re: pg_upgrade and logical replication

On Sun, Feb 19, 2023 at 5:31 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Sat, Feb 18, 2023 at 04:12:52PM +0530, Amit Kapila wrote:

I also don't know if there is any other safe way for newly added
tables apart from the above suggestion to create separate publications
but that can work only in specific cases.

I might be missing something, but what could go wrong if pg_upgrade could emit
a bunch of commands like:

ALTER SUBSCRIPTION subname ADD RELATION relid STATE 'x' LSN 'X/Y';

How will we know the STATE and LSN of each relation?

In the pg_subscription_rel catalog of the upgraded server? I didn't look in
detail at how the information is updated, but I'm assuming that if logical
replication survives a database restart it shouldn't be a problem to also
fully dump it during pg_upgrade.

But even if we
know that, what is the guarantee that the publisher side has still
retained the corresponding slots?

No guarantee, but if you're just doing a pg_upgrade of a logical replica, why
would you drop the replication slot? In any case the warning you mentioned in
the pg_dump documentation would still apply and you would have to re-enable it
as needed; the only difference is that you would actually be able to keep your
logical replication after a pg_upgrade if you need to. If you dropped the
replication slot on the publisher side, then simply remove the subscriptions
on the upgraded node too, or create new ones, exactly as you would do with the
current pg_upgrade workflow.

I think the current mechanism tries to provide more flexibility to the
users. OTOH, in some of the cases where users don't want to change
anything in the logical replication (both upstream and downstream
function as it is) after the upgrade then they need to do more work. I
think ideally there should be some option in pg_dump that allows us to
dump the contents of pg_subscription_rel as well, so that it is easier
for users to continue replication after the upgrade. We can then use
it for binary-upgrade mode as well.

--
With Regards,
Amit Kapila.

#9 Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#8)
Re: pg_upgrade and logical replication

On Mon, Feb 20, 2023 at 11:07:42AM +0530, Amit Kapila wrote:

On Sun, Feb 19, 2023 at 5:31 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

I might be missing something, but what could go wrong if pg_upgrade could emit
a bunch of commands like:

ALTER SUBSCRIPTION subname ADD RELATION relid STATE 'x' LSN 'X/Y';

How will we know the STATE and LSN of each relation?

In the pg_subscription_rel catalog of the upgraded server? I didn't look in
detail at how the information is updated, but I'm assuming that if logical
replication survives a database restart it shouldn't be a problem to also
fully dump it during pg_upgrade.

But even if we
know that, what is the guarantee that the publisher side has still
retained the corresponding slots?

No guarantee, but if you're just doing a pg_upgrade of a logical replica, why
would you drop the replication slot? In any case the warning you mentioned in
the pg_dump documentation would still apply and you would have to re-enable it
as needed; the only difference is that you would actually be able to keep your
logical replication after a pg_upgrade if you need to. If you dropped the
replication slot on the publisher side, then simply remove the subscriptions
on the upgraded node too, or create new ones, exactly as you would do with the
current pg_upgrade workflow.

I think the current mechanism tries to provide more flexibility to the
users. OTOH, in some of the cases where users don't want to change
anything in the logical replication (both upstream and downstream
function as it is) after the upgrade then they need to do more work. I
think ideally there should be some option in pg_dump that allows us to
dump the contents of pg_subscription_rel as well, so that it is easier
for users to continue replication after the upgrade. We can then use
it for binary-upgrade mode as well.

Is there really a use case for dumping the content of pg_subscription_rel
outside of pg_upgrade? I'm not particularly worried about the publisher going
away or changing while pg_upgrade is running, but for a normal pg_dump /
pg_restore I don't really see how anyone would actually want to resume logical
replication from a pg_dump, especially since it's almost guaranteed that the
node will already have consumed data from the publication that won't be in the
dump in the first place.

Are you ok with the suggested syntax above (probably with extra parens to avoid
adding new keywords), or do you have some better suggestion? I'm a bit worried
about adding some O(n) commands, as it can add some noticeable slow-down for
pg_upgrade-ing logical replica, but I don't really see how to avoid that. Note
that if we make this option available to end-users, we will have to use the
relation name rather than its oid, which will make this option even more
expensive when restoring due to the extra lookups.

For the pg_upgrade use-case, do you see any reason to not restore the
pg_subscription_rel by default? Maybe having an option to not restore it would
make sense if it indeed adds noticeable overhead when publications have a lot of
tables?

#10 Julien Rouhaud
rjuju123@gmail.com
In reply to: Julien Rouhaud (#9)
1 attachment(s)
Re: pg_upgrade and logical replication

On Mon, Feb 20, 2023 at 03:07:37PM +0800, Julien Rouhaud wrote:

On Mon, Feb 20, 2023 at 11:07:42AM +0530, Amit Kapila wrote:

I think the current mechanism tries to provide more flexibility to the
users. OTOH, in some of the cases where users don't want to change
anything in the logical replication (both upstream and downstream
function as it is) after the upgrade then they need to do more work. I
think ideally there should be some option in pg_dump that allows us to
dump the contents of pg_subscription_rel as well, so that is easier
for users to continue replication after the upgrade. We can then use
it for binary-upgrade mode as well.

Is there really a use case for dumping the content of pg_subscription_rel
outside of pg_upgrade? I'm not particularly worried about the publisher going
away or changing while pg_upgrade is running, but for a normal pg_dump /
pg_restore I don't really see how anyone would actually want to resume logical
replication from a pg_dump, especially since it's almost guaranteed that the
node will already have consumed data from the publication that won't be in the
dump in the first place.

Are you ok with the suggested syntax above (probably with extra parens to avoid
adding new keywords), or do you have some better suggestion? I'm a bit worried
about adding some O(n) commands, as they can add some noticeable slow-down when
pg_upgrade-ing a logical replica, but I don't really see how to avoid that. Note
that if we make this option available to end-users, we will have to use the
relation name rather than its oid, which will make this option even more
expensive when restoring due to the extra lookups.

For the pg_upgrade use-case, do you see any reason to not restore the
pg_subscription_rel by default? Maybe having an option to not restore it would
make sense if it indeed adds noticeable overhead when publications have a lot of
tables?

Since I didn't hear any objection I worked on a POC patch with this approach.

For now when pg_dump is invoked with --binary-upgrade, it will always emit extra
commands to restore the relation list. This command is only allowed when the
server is started in binary upgrade mode.

The new command is of the form

ALTER SUBSCRIPTION name ADD TABLE (relid = X, state = 'Y', lsn = 'Z/Z')

with the lsn part being optional. I'm not sure if there should be some new
regression test for that, as it would be a bit costly. Note that pg_upgrade of
a logical replica isn't covered by any regression test that I could find.

I did test it manually though, and it fixes my original problem, allowing me to
safely resume logical replication by just re-enabling it. I didn't do any
benchmarking to see how much overhead it adds.
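For reference, with the patch applied the relevant part of the dump output would look something like this; the subscription name, connection string, OIDs, state codes and LSN are made up for illustration:

```sql
-- Hypothetical excerpt of pg_dump --binary-upgrade output with the patch
CREATE SUBSCRIPTION sub1 CONNECTION 'host=pub dbname=db1'
    PUBLICATION pub1 WITH (connect = false, slot_name = 'sub1');
ALTER SUBSCRIPTION sub1 ADD TABLE (relid = 16384, state = 'r', lsn = '0/1A2B3C4D');
ALTER SUBSCRIPTION sub1 ADD TABLE (relid = 16390, state = 'i');
```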

Attachments:

v1-0001-POC-Preserve-the-subscription-relations-during-pg.patchtext/plain; charset=us-asciiDownload
From 18ccb63d223e020fd3027e2ddcbc997eb968c1ba Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Wed, 22 Feb 2023 09:19:32 +0800
Subject: [PATCH v1] POC: Preserve the subscription relations during pg_upgrade

Previously, only the subscription information was preserved.  Without the list
of relations and their state it's impossible to re-enable the subscriptions
without missing some records as the list of relations can only be refreshed
after enabling the subscription (and therefore starting the apply worker).
Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump in binary upgrade mode to emit
additional commands to be able to restore the content of pg_subscription_rel.

This new ALTER SUBSCRIPTION subcommand, usable only during binary upgrade, has
the following syntax:

ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

The relation is identified by its oid, as it's preserved during pg_upgrade.
The lsn is optional, and defaults to NULL / InvalidXLogRecPtr.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 src/backend/commands/subscriptioncmds.c | 57 +++++++++++++++++
 src/backend/parser/gram.y               | 11 ++++
 src/bin/pg_dump/pg_dump.c               | 84 +++++++++++++++++++++++++
 src/bin/pg_dump/pg_dump.h               | 12 ++++
 src/include/nodes/parsenodes.h          |  3 +-
 5 files changed, 166 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..7f2560faf8 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,8 @@
 #define SUBOPT_DISABLE_ON_ERR		0x00000400
 #define SUBOPT_LSN					0x00000800
 #define SUBOPT_ORIGIN				0x00001000
+#define SUBOPT_RELID				0x00002000
+#define SUBOPT_STATE				0x00004000
 
 /* check if the 'val' has 'bits' set */
 #define IsSet(val, bits)  (((val) & (bits)) == (bits))
@@ -90,6 +92,8 @@ typedef struct SubOpts
 	bool		disableonerr;
 	char	   *origin;
 	XLogRecPtr	lsn;
+	Oid			relid;
+	char		state;
 } SubOpts;
 
 static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -324,6 +328,38 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
 			opts->specified_opts |= SUBOPT_LSN;
 			opts->lsn = lsn;
 		}
+		else if (IsSet(supported_opts, SUBOPT_RELID) &&
+				 strcmp(defel->defname, "relid") == 0)
+		{
+			Oid			relid = defGetObjectId(defel);
+
+			if (IsSet(opts->specified_opts, SUBOPT_RELID))
+				errorConflictingDefElem(defel, pstate);
+
+			if (!OidIsValid(relid))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("invalid relation identifier used")));
+
+			opts->specified_opts |= SUBOPT_RELID;
+			opts->relid = relid;
+		}
+		else if (IsSet(supported_opts, SUBOPT_STATE) &&
+				 strcmp(defel->defname, "state") == 0)
+		{
+			char	   *state_str = defGetString(defel);
+
+			if (IsSet(opts->specified_opts, SUBOPT_STATE))
+				errorConflictingDefElem(defel, pstate);
+
+			if (strlen(state_str) != 1)
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("invalid relation state used")));
+
+			opts->specified_opts |= SUBOPT_STATE;
+			opts->state = state_str[0];
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -1341,6 +1377,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 				break;
 			}
 
+		case ALTER_SUBSCRIPTION_ADD_TABLE:
+			{
+				if (!IsBinaryUpgrade)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("ALTER SUBSCRIPTION ... ADD TABLE is not supported")));
+
+				supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
+				parse_subscription_options(pstate, stmt->options,
+										   supported_opts, &opts);
+
+				/* relid and state should always be provided. */
+				Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
+				Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
+
+				AddSubscriptionRelState(subid, opts.relid, opts.state,
+										opts.lsn);
+
+				break;
+			}
+
 		default:
 			elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
 				 stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a0138382a1..0a3448c487 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10670,6 +10670,17 @@ AlterSubscriptionStmt:
 					n->options = $5;
 					$$ = (Node *) n;
 				}
+			/* for binary upgrade only */
+			| ALTER SUBSCRIPTION name ADD_P TABLE definition
+				{
+					AlterSubscriptionStmt *n =
+						makeNode(AlterSubscriptionStmt);
+
+					n->kind = ALTER_SUBSCRIPTION_ADD_TABLE;
+					n->subname = $3;
+					n->options = $6;
+					$$ = (Node *) n;
+				}
 		;
 
 /*****************************************************************************
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..61f54ee549 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4470,6 +4470,69 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionRels
+ *	  get information about the given subscription's relations
+ */
+static SubRelInfo *
+getSubscriptionRels(Archive *fout, Oid subid, int *nrels)
+{
+	SubRelInfo *rels;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i,
+				ntups;
+
+	if (!fout->dopt->binary_upgrade)
+	{
+		*nrels = 0;
+
+		return NULL;
+	}
+
+	query = createPQExpBuffer();
+
+	appendPQExpBuffer(query, "SELECT srrelid, srsubstate, srsublsn "
+								" FROM pg_subscription_rel"
+								" WHERE srsubid = %u", subid);
+
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	*nrels = ntups;
+
+	if (ntups == 0)
+	{
+		rels = NULL;
+		goto cleanup;
+	}
+
+	/*
+	 * Get subscription relation fields.
+	 */
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	rels = pg_malloc(ntups * sizeof(SubRelInfo));
+
+	for (i = 0; i < ntups; i++)
+	{
+		rels[i].srrelid = atooid(PQgetvalue(res, i, i_srrelid));
+		rels[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		rels[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+
+	return rels;
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4607,6 +4670,10 @@ getSubscriptions(Archive *fout)
 			pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
 
+		subinfo[i].subrels = getSubscriptionRels(fout,
+												 subinfo[i].dobj.catId.oid,
+												 &subinfo[i].nrels);
+
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
 	}
@@ -4690,6 +4757,22 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 	appendPQExpBufferStr(query, ");\n");
 
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		for (i = 0; i < subinfo->nrels; i++)
+		{
+			appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
+									 "(RELID = %u, STATE = '%c'",
+									 qsubname,
+									 subinfo->subrels[i].srrelid,
+									 subinfo->subrels[i].srsubstate);
+
+			if (subinfo->subrels[i].srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", LSN = '%s'",
+								  subinfo->subrels[i].srsublsn);
+
+			appendPQExpBufferStr(query, ");");
+		}
+
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
 								  .owner = subinfo->rolname,
@@ -4697,6 +4780,7 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 								  .section = SECTION_POST_DATA,
 								  .createStmt = query->data,
 								  .dropStmt = delq->data));
+	}
 
 	if (subinfo->dobj.dump & DUMP_COMPONENT_COMMENT)
 		dumpComment(fout, "SUBSCRIPTION", qsubname,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..03fb0dafe0 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -646,6 +646,16 @@ typedef struct _PublicationSchemaInfo
 	NamespaceInfo *pubschema;
 } PublicationSchemaInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	Oid		srrelid;
+	char	srsubstate;
+	char   *srsublsn;
+} SubRelInfo;
+
 /*
  * The SubscriptionInfo struct is used to represent subscription.
  */
@@ -662,6 +672,8 @@ typedef struct _SubscriptionInfo
 	char	   *suborigin;
 	char	   *subsynccommit;
 	char	   *subpublications;
+	int			nrels;
+	SubRelInfo *subrels;
 } SubscriptionInfo;
 
 /*
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index f7d7f10f7d..8f66307287 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3917,7 +3917,8 @@ typedef enum AlterSubscriptionType
 	ALTER_SUBSCRIPTION_DROP_PUBLICATION,
 	ALTER_SUBSCRIPTION_REFRESH,
 	ALTER_SUBSCRIPTION_ENABLED,
-	ALTER_SUBSCRIPTION_SKIP
+	ALTER_SUBSCRIPTION_SKIP,
+	ALTER_SUBSCRIPTION_ADD_TABLE
 } AlterSubscriptionType;
 
 typedef struct AlterSubscriptionStmt
-- 
2.37.0

#11Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#10)
Re: pg_upgrade and logical replication

On Wed, Feb 22, 2023 at 12:13 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Mon, Feb 20, 2023 at 03:07:37PM +0800, Julien Rouhaud wrote:

On Mon, Feb 20, 2023 at 11:07:42AM +0530, Amit Kapila wrote:

I think the current mechanism tries to provide more flexibility to the
users. OTOH, in some of the cases where users don't want to change
anything in the logical replication (both upstream and downstream
function as it is) after the upgrade then they need to do more work. I
think ideally there should be some option in pg_dump that allows us to
dump the contents of pg_subscription_rel as well, so that is easier
for users to continue replication after the upgrade. We can then use
it for binary-upgrade mode as well.

Is there really a use case for dumping the content of pg_subscription_rel
outside of pg_upgrade?

I think the users who want to take a dump and restore the entire
cluster may need it there for the same reason as pg_upgrade needs it.
TBH, I have not seen such a request but this is what I imagine one
would expect if we provide this functionality via pg_upgrade.

I'm not particularly worried about the publisher going
away or changing while pg_upgrade is running, but for a normal pg_dump /
pg_restore I don't really see how anyone would actually want to resume logical
replication from a pg_dump, especially since it's almost guaranteed that the
node will already have consumed data from the publication that won't be in the
dump in the first place.

Are you ok with the suggested syntax above (probably with extra parens to avoid
adding new keywords), or do you have some better suggestion? I'm a bit worried
about adding some O(n) commands, as they can add some noticeable slow-down when
pg_upgrade-ing a logical replica, but I don't really see how to avoid that. Note
that if we make this option available to end-users, we will have to use the
relation name rather than its oid, which will make this option even more
expensive when restoring due to the extra lookups.

For the pg_upgrade use-case, do you see any reason to not restore the
pg_subscription_rel by default?

As I said earlier, one can very well say that giving more flexibility
(in terms of where the publications will be) after a restore is a better
idea. Also, we have been doing the same till now without any major
complaints, so it makes sense to keep the current behavior as default.

Maybe having an option to not restore it would
make sense if it indeed adds noticeable overhead when publications have a lot of
tables?

Yeah, that could be another reason to not do it default.

Since I didn't hear any objection I worked on a POC patch with this approach.

For now when pg_dump is invoked with --binary-upgrade, it will always emit extra
commands to restore the relation list. This command is only allowed when the
server is started in binary upgrade mode.

The new command is of the form

ALTER SUBSCRIPTION name ADD TABLE (relid = X, state = 'Y', lsn = 'Z/Z')

with the lsn part being optional.

BTW, do we restore the origin and its LSN after the upgrade? Because
without that this won't be sufficient, as that is required for the apply
worker to ensure that it is in sync with the table sync workers.

--
With Regards,
Amit Kapila.

#12Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#11)
Re: pg_upgrade and logical replication

On Sat, Feb 25, 2023 at 11:24:17AM +0530, Amit Kapila wrote:

On Wed, Feb 22, 2023 at 12:13 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

Is there really a use case for dumping the content of pg_subscription_rel
outside of pg_upgrade?

I think the users who want to take a dump and restore the entire
cluster may need it there for the same reason as pg_upgrade needs it.
TBH, I have not seen such a request but this is what I imagine one
would expect if we provide this functionality via pg_upgrade.

But the pg_subscription_rel data are only needed if you want to resume logical
replication from the exact previous state; otherwise you can always refresh the
subscription and it will retrieve the list of relations automatically (dealing
with initial sync and so on). It's hard to see how that could happen with
a plain pg_dump.
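To illustrate, after a plain pg_dump / pg_restore the expected way to resume is simply to re-fetch the relation list from the publisher, along these lines (the subscription name is hypothetical):

```sql
ALTER SUBSCRIPTION sub1 ENABLE;
-- re-fetches the relation list from the publisher, triggering an
-- initial sync for each table as if it had just been added
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION WITH (copy_data = true);
```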

The only usable scenario I can see would be to disable all subscriptions on the
logical replica, maybe make sure that no one writes to those tables if you
want to eventually switch over on the restored node, do a pg_dump(all), restore
it and then resume the logical replication / subscription(s) on the restored
server. That's a lot of constraints for something that pg_upgrade deals with
so much more efficiently. Maybe one plausible use case would be to split a
single logical replica into N servers, one per database / publication or
something like that. In that case pg_upgrade won't be that useful and if each
target subset is small enough a pg_dump/pg_restore may be a viable option. But
if that's a viable option then surely creating the logical replica from scratch
using normal logical table sync should be an even better option.

I'm really worried that it's going to be a giant foot-gun that any user should
really avoid.

For the pg_upgrade use-case, do you see any reason to not restore the
pg_subscription_rel by default?

As I said earlier, one can very well say that giving more flexibility
(in terms of where the publications will be) after a restore is a better
idea. Also, we have been doing the same till now without any major
complaints, so it makes sense to keep the current behavior as default.

I'm a bit dubious that anyone actually tried to run pg_upgrade on a logical
replica and then kept using logical replication, as it's currently impossible
to safely resume replication without truncating all target relations.

As I mentioned before, if we keep the current behavior as a default there
should be an explicit warning in the documentation stating that you need to
truncate all target relations before resuming logical replication, as otherwise
you are guaranteed to lose data.
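In other words, with the current behavior the only safe way to resume after a pg_upgrade is a full resync, something like the following (subscription and table names hypothetical):

```sql
-- truncate every subscribed table on the upgraded node, then force a
-- full initial sync; without the preserved pg_subscription_rel, rows
-- replicated before the upgrade would otherwise be duplicated or,
-- if copy_data is disabled, silently missing
TRUNCATE t1, t2;
ALTER SUBSCRIPTION sub1 ENABLE;
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION WITH (copy_data = true);
```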

Maybe having an option to not restore it would
make sense if it indeed adds noticeable overhead when publications have a lot of
tables?

Yeah, that could be another reason to not do it default.

I will do some benchmarks with various numbers of relations, from high to
unreasonable.

Since I didn't hear any objection I worked on a POC patch with this approach.

For now when pg_dump is invoked with --binary, it will always emit extra
commands to restore the relation list. This command is only allowed when the
server is started in binary upgrade mode.

The new command is of the form

ALTER SUBSCRIPTION name ADD TABLE (relid = X, state = 'Y', lsn = 'Z/Z')

with the lsn part being optional.

BTW, do we restore the origin and its LSN after the upgrade? Because
without that this won't be sufficient, as that is required for the apply
worker to ensure that it is in sync with the table sync workers.

We currently don't, which is yet another sign that no one actually tried to
resume logical replication after a pg_upgrade. That being said, trying to
pg_upgrade a node that's currently syncing relations seems like a bad idea
(I didn't even think to try), but I guess it should also be supported. I will
work on that too. Assuming we add a new option for controlling either plain
pg_dump and/or pg_upgrade behavior, should this option control both
pg_subscription_rel and replication origins and their data or do we need more
granularity?
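For reference, the origin state could presumably be recreated on the upgraded node with the existing replication origin functions, roughly like this (the origin name follows the pg_<suboid> convention; the subscription OID and LSN are made up):

```sql
-- recreate the subscription's replication origin and restore the remote
-- LSN up to which changes had already been applied before the upgrade
SELECT pg_replication_origin_create('pg_16399');
SELECT pg_replication_origin_advance('pg_16399', '0/1A2B3C4D');
```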

#13Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#12)
Re: pg_upgrade and logical replication

On Sun, Feb 26, 2023 at 8:35 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Sat, Feb 25, 2023 at 11:24:17AM +0530, Amit Kapila wrote:

The new command is of the form

ALTER SUBSCRIPTION name ADD TABLE (relid = X, state = 'Y', lsn = 'Z/Z')

with the lsn part being optional.

BTW, do we restore the origin and its LSN after the upgrade? Because
without that this won't be sufficient, as that is required for the apply
worker to ensure that it is in sync with the table sync workers.

We currently don't, which is yet another sign that no one actually tried to
resume logical replication after a pg_upgrade. That being said, trying to
pg_upgrade a node that's currently syncing relations seems like a bad idea
(I didn't even think to try), but I guess it should also be supported. I will
work on that too. Assuming we add a new option for controlling either plain
pg_dump and/or pg_upgrade behavior, should this option control both
pg_subscription_rel and replication origins and their data or do we need more
granularity?

My vote would be to have one option for both. BTW, thinking some more
on this, how will we allow replication to continue after upgrading the
publisher? During upgrade, we don't retain slots, so the replication
won't continue. I think after upgrading the subscriber node, the user will
need to upgrade the publisher as well.

--
With Regards,
Amit Kapila.

#14Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#13)
Re: pg_upgrade and logical replication

On Mon, Feb 27, 2023 at 03:39:18PM +0530, Amit Kapila wrote:

BTW, thinking some more
on this, how will we allow replication to continue after upgrading the
publisher? During upgrade, we don't retain slots, so the replication
won't continue. I think after upgrading the subscriber node, the user will
need to upgrade the publisher as well.

The scenario I'm interested in is to rely on logical replication only for the
upgrade, so the end state (and start state) is to go back to physical
replication. In that case, I would just create new physical replica from the
pg_upgrade'd server and failover to that node, or rsync the previous publisher
node to make it a physical replica.

But even if you want to only rely on logical replication, I'm not sure why you
would want to keep the publisher node as a publisher node? I think that doing
it this way will lead to a longer downtime compared to doing a failover on the
pg_upgrade'd node, make it a publisher and then move the former publisher node
to a subscriber.

#15Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#14)
Re: pg_upgrade and logical replication

On Tue, Feb 28, 2023 at 7:55 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Mon, Feb 27, 2023 at 03:39:18PM +0530, Amit Kapila wrote:

BTW, thinking some more
on this, how will we allow replication to continue after upgrading the
publisher? During upgrade, we don't retain slots, so the replication
won't continue. I think after upgrading the subscriber node, the user will
need to upgrade the publisher as well.

The scenario I'm interested in is to rely on logical replication only for the
upgrade, so the end state (and start state) is to go back to physical
replication. In that case, I would just create new physical replica from the
pg_upgrade'd server and failover to that node, or rsync the previous publisher
node to make it a physical replica.

But even if you want to only rely on logical replication, I'm not sure why you
would want to keep the publisher node as a publisher node? I think that doing
it this way will lead to a longer downtime compared to doing a failover on the
pg_upgrade'd node, make it a publisher and then move the former publisher node
to a subscriber.

I am not sure if this is what everyone usually follows, because it sounds
like a lot of work to me. IIUC, to achieve this, one needs to recreate
all the publications and subscriptions after changing the roles of
publisher and subscriber. Can you please write steps to show exactly
what you have in mind to avoid any misunderstanding?

--
With Regards,
Amit Kapila.

#16Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#15)
Re: pg_upgrade and logical replication

On Tue, Feb 28, 2023 at 08:56:37AM +0530, Amit Kapila wrote:

On Tue, Feb 28, 2023 at 7:55 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

The scenario I'm interested in is to rely on logical replication only for the
upgrade, so the end state (and start state) is to go back to physical
replication. In that case, I would just create new physical replica from the
pg_upgrade'd server and failover to that node, or rsync the previous publisher
node to make it a physical replica.

But even if you want to only rely on logical replication, I'm not sure why you
would want to keep the publisher node as a publisher node? I think that doing
it this way will lead to a longer downtime compared to doing a failover on the
pg_upgrade'd node, make it a publisher and then move the former publisher node
to a subscriber.

I am not sure if this is what everyone usually follows, because it sounds
like a lot of work to me. IIUC, to achieve this, one needs to recreate
all the publications and subscriptions after changing the roles of
publisher and subscriber. Can you please write steps to show exactly
what you have in mind to avoid any misunderstanding?

Well, as I mentioned I'm *not* interested in a logical-replication-only
scenario. Logical replication is nice but it will always be less efficient
than physical replication, and some workloads also don't really play well with
it. So while it can be a huge asset in some cases, I'm for now looking at
leveraging logical replication only for the purpose of a major upgrade of a
physical replication cluster, so the publications and subscriptions are only
temporary and trashed after use.

That being said I was only saying that if I had to do a major upgrade of a
logical replication cluster this is probably how I would try to do it, to
minimize downtime, even if there are probably *a lot* of difficulties to
overcome.

#17Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Julien Rouhaud (#3)
Re: pg_upgrade and logical replication

On Fri, Feb 17, 2023 at 7:35 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

Any table later added in the
publication is either already fully replicated until that LSN on the upgraded
node, so only the delta is needed, or has been created after that LSN. In the
latter case, the entirety of the table will be replicated with the logical
replication as a delta right?

What if we consider a slightly adjusted procedure?

0. Temporarily, forbid running any DDL on the source cluster.
1. On the source, create publication, replication slot and remember
the LSN for it
2. Restore the target cluster to that LSN using recovery_target_lsn (PITR)
3. Run pg_upgrade on the target cluster
4. Only now, create subscription to target
5. Wait until logical replication catches up
6. Perform a switchover to the new cluster taking care of lags in sequences, etc
7. Resume DDL when needed

Do you see any data loss happening in this approach?

#18Julien Rouhaud
rjuju123@gmail.com
In reply to: Nikolay Samokhvalov (#17)
Re: pg_upgrade and logical replication

On Tue, Feb 28, 2023 at 08:02:13AM -0800, Nikolay Samokhvalov wrote:

On Fri, Feb 17, 2023 at 7:35 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

Any table later added in the
publication is either already fully replicated until that LSN on the upgraded
node, so only the delta is needed, or has been created after that LSN. In the
latter case, the entirety of the table will be replicated with the logical
replication as a delta right?

What if we consider a slightly adjusted procedure?

0. Temporarily, forbid running any DDL on the source cluster.

This is (at least for me) a non-starter, as I want an approach that doesn't
impact the primary node, at least not too much.

Also, how would you do that? If you need some new infrastructure it means that
you can only upgrade nodes starting from pg16+, while my approach can upgrade
any node that supports publications as long as the target version is pg16+.

It also raises some concerns: why prevent all DDL when e.g. creating a
temporary table shouldn't be a problem, same for renaming some underlying
object, adding indexes... You would have to curate a list of what exactly is
allowed, which is never great.

Also, how exactly would you ensure that DDL has indeed been forbidden since a
long enough point in time, rather than just "currently" forbidden at the time
you run some check?

#19Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#16)
Re: pg_upgrade and logical replication

On Tue, Feb 28, 2023 at 10:18 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Tue, Feb 28, 2023 at 08:56:37AM +0530, Amit Kapila wrote:

On Tue, Feb 28, 2023 at 7:55 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

The scenario I'm interested in is to rely on logical replication only for the
upgrade, so the end state (and start state) is to go back to physical
replication. In that case, I would just create new physical replica from the
pg_upgrade'd server and failover to that node, or rsync the previous publisher
node to make it a physical replica.

But even if you want to only rely on logical replication, I'm not sure why you
would want to keep the publisher node as a publisher node? I think that doing
it this way will lead to a longer downtime compared to doing a failover on the
pg_upgrade'd node, make it a publisher and then move the former publisher node
to a subscriber.

I am not sure if this is what everyone usually follows, because it sounds
like a lot of work to me. IIUC, to achieve this, one needs to recreate
all the publications and subscriptions after changing the roles of
publisher and subscriber. Can you please write steps to show exactly
what you have in mind to avoid any misunderstanding?

Well, as I mentioned I'm *not* interested in a logical-replication-only
scenario. Logical replication is nice but it will always be less efficient
than physical replication, and some workloads also don't really play well with
it. So while it can be a huge asset in some cases, I'm for now looking at
leveraging logical replication only for the purpose of a major upgrade of a
physical replication cluster, so the publications and subscriptions are only
temporary and trashed after use.

That being said I was only saying that if I had to do a major upgrade of a
logical replication cluster this is probably how I would try to do it, to
minimize downtime, even if there are probably *a lot* of difficulties to
overcome.

Okay, but it would be better if you list out your detailed steps. It
would be useful to support the new mechanism in this area if others
also find your steps to upgrade useful.

--
With Regards,
Amit Kapila.

#20Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#19)
Re: pg_upgrade and logical replication

On Wed, Mar 01, 2023 at 11:51:49AM +0530, Amit Kapila wrote:

On Tue, Feb 28, 2023 at 10:18 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

Well, as I mentioned I'm *not* interested in a logical-replication-only
scenario. Logical replication is nice but it will always be less efficient
than physical replication, and some workloads also don't really play well with
it. So while it can be a huge asset in some cases I'm for now looking at
leveraging logical replication for the purpose of major upgrade only for a
physical replication cluster, so the publications and subscriptions are only
temporary and trashed after use.

That being said I was only saying that if I had to do a major upgrade of a
logical replication cluster this is probably how I would try to do it, to
minimize downtime, even if there are probably *a lot* of difficulties to
overcome.

Okay, but it would be better if you list out your detailed steps. It
would be useful to support the new mechanism in this area if others
also find your steps to upgrade useful.

Sure. Here are the overly detailed steps:

1) set up a normal physical replication cluster (pg_basebackup, restoring a
PITR backup, whatever); let's call the primary node "A" and the replica node "B"
2) ensure WAL level is "logical" on the primary node A
3) create a logical replication slot on every (connectable) database (or just
the one you're interested in if you don't want to preserve everything) on A
4) create a FOR ALL TABLES publication (again for every database or just the
one you're interested in)
5) wait for replication to be reasonably, if not entirely, up to date
6) promote the standby node B
7) retrieve the promotion LSN (from the XXXXXXXX.history file,
pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn()...)
8) call pg_replication_slot_advance() with that LSN for all previously created
logical replication slots on A
9) create a normal subscription on all wanted databases on the promoted node
10) wait for it to catch up if needed on B
11) stop the node B
12) run pg_upgrade on B, creating the new node C
13) start C, run the global ANALYZE and any sanity check needed (hopefully you
would have validated that your application is compatible with that new
version before this point)
14) re-enable the subscription on C. This is currently not possible without
losing data; the patch fixes that
15) wait for it to catch up if needed
16) create any missing relations and run ALTER SUBSCRIPTION ... REFRESH if
needed
17) trash B
18) create new nodes D, E... as physical replicas from C if needed, possibly
using a cheaper approach like pg_start_backup() / rsync / pg_stop_backup() if
needed
19) switchover to C and trash A (or convert it to another replica if you want)
20) trash the publications on C on all databases

As noted, step 14 is currently problematic, and it is also problematic in any
variation of that scenario that doesn't require you to entirely recreate the
node C from scratch using logical replication, which is what I want to avoid.
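For reference, the per-database commands for steps 3, 4 and 8 might look
something like this on node A (slot and publication names, and the LSN, are
just placeholders):

-- on A, in each database to preserve (steps 3 and 4)
SELECT pg_create_logical_replication_slot('slot_for_db_xxx', 'pgoutput');
CREATE PUBLICATION pub_for_db_xxx FOR ALL TABLES;

-- on A, once B's promotion LSN is known (step 8)
SELECT pg_replication_slot_advance('slot_for_db_xxx', '0/3000148'::pg_lsn);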

This isn't terribly complicated but requires you to be really careful if you don't
want to end up with an incorrect node C. This approach is also currently not
entirely ideal, but hopefully logical replication of sequences and DDL will
remove the main sources of downtime when upgrading using logical replication.

My ultimate goal is to provide some tooling to do that in a much simpler way.
Maybe a new "promote to logical" action that would take care of steps 2 to 9.
Users would therefore only have to do this "promotion to logical", and then run
pg_upgrade and create a new physical replication cluster if they want.

#21Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Julien Rouhaud (#18)
Re: pg_upgrade and logical replication

On Tue, Feb 28, 2023 at 4:43 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Tue, Feb 28, 2023 at 08:02:13AM -0800, Nikolay Samokhvalov wrote:

0. Temporarily, forbid running any DDL on the source cluster.

This is (at least for me) a non starter, as I want an approach that doesn't
impact the primary node, at least not too much.

...

Also, how exactly would you ensure that indeed DDL were forbidden since a long
enough point in time rather than just "currently" forbidden at the time you do
some check?

Thanks for your response. I didn't expect that DDL part would attract
attention, my message was not about DDL... – the DDL part was there
just to show that the recipe I described is possible for any PG
version that supports logical replication.

Usually, people perform upgrades involving logical replication using full
initialization at the logical level – at least all the posts and articles I
could find talk about that. Meanwhile, on one hand, for large DBs, logical
copying is hard (slow, holding the xmin horizon, etc.), and on the other
hand, a physical replica can be transformed to logical (using the trick
with recovery_target_lsn, syncing the state with the slot's LSN) and
initialization at the physical level works much better for large
databases. But there is a problem with logical replication when we run
pg_upgrade – as discussed in this thread. So I just wanted to mention
that if we change the order of actions and first run pg_upgrade, and
only then create publication, there should not be a problem anymore.
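(For the record, the physical-to-logical trick mentioned above boils down to
something like the following in postgresql.conf on the replica before its final
restart, where the LSN is the one the logical slot was created at -- the value
shown here is of course a placeholder:

recovery_target_lsn = '0/3000148'      # the slot's creation LSN
recovery_target_action = 'promote'
)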

#22Julien Rouhaud
rjuju123@gmail.com
In reply to: Nikolay Samokhvalov (#21)
Re: pg_upgrade and logical replication

On Wed, Mar 01, 2023 at 07:56:47AM -0800, Nikolay Samokhvalov wrote:

On Tue, Feb 28, 2023 at 4:43 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Tue, Feb 28, 2023 at 08:02:13AM -0800, Nikolay Samokhvalov wrote:

0. Temporarily, forbid running any DDL on the source cluster.

This is (at least for me) a non starter, as I want an approach that doesn't
impact the primary node, at least not too much.

...

Also, how exactly would you ensure that indeed DDL were forbidden since a long
enough point in time rather than just "currently" forbidden at the time you do
some check?

Thanks for your response. I didn't expect that DDL part would attract
attention, my message was not about DDL... – the DDL part was there
just to show that the recipe I described is possible for any PG
version that supports logical replication.

Well, yes, but I already mentioned that in my original email, as "dropping all
subscriptions and recreating them" is obviously the same as simply creating
them later. I don't even think that preventing DDL is necessary.

One really important detail you forgot though is that you need to create the
subscription using "copy_data = false". Not hard to do, but that's not the
default so it's yet another trap users can fall into when trying to do a major
version upgrade that can lead to a corrupted logical replica.
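i.e. something like this (names here are just placeholders):

CREATE SUBSCRIPTION sub_for_db_xxx
    CONNECTION 'host=... dbname=db_xxx'
    PUBLICATION pub_for_db_xxx
    WITH (copy_data = false);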

#23Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#20)
Re: pg_upgrade and logical replication

On Wed, Mar 1, 2023 at 12:25 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Wed, Mar 01, 2023 at 11:51:49AM +0530, Amit Kapila wrote:

On Tue, Feb 28, 2023 at 10:18 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

Okay, but it would be better if you list out your detailed steps. It
would be useful to support the new mechanism in this area if others
also find your steps to upgrade useful.

Sure. Here are the overly detailed steps:

1) setup a normal physical replication cluster (pg_basebackup, restoring PITR,
whatever), let's call the primary node "A" and replica node "B"
2) ensure WAL level is "logical" on the primary node A
3) create a logical replication slot on every (connectable) database (or just
the one you're interested in if you don't want to preserve everything) on A
4) create a FOR ALL TABLES publication (again for every database or just the
one you're interested in)
5) wait for replication to be reasonably if not entirely up to date
6) promote the standby node B
7) retrieve the promotion LSN (from the XXXXXXXX.history file,
pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn()...)
8) call pg_replication_slot_advance() with that LSN for all previously created
logical replication slots on A

How are these slots used? Do subscriptions use these slots?

9) create a normal subscription on all wanted databases on the promoted node
10) wait for it to catch up if needed on B
11) stop the node B
12) run pg_upgrade on B, creating the new node C
13) start C, run the global ANALYZE and any sanity check needed (hopefully you
would have validated that your application is compatible with that new
version before this point)
14) re-enable the subscription on C. This is currently not possible without
losing data; the patch fixes that
15) wait for it to catch up if needed
16) create any missing relations and run ALTER SUBSCRIPTION ... REFRESH if
needed
17) trash B
18) create new nodes D, E... as physical replicas from C if needed, possibly
using a cheaper approach like pg_start_backup() / rsync / pg_stop_backup() if
needed
19) switchover to C and trash A (or convert it to another replica if you want)
20) trash the publications on C on all databases

As noted, step 14 is currently problematic, and it is also problematic in any
variation of that scenario that doesn't require you to entirely recreate the
node C from scratch using logical replication, which is what I want to avoid.

This isn't terribly complicated but requires you to be really careful if you don't
want to end up with an incorrect node C. This approach is also currently not
entirely ideal, but hopefully logical replication of sequences and DDL will
remove the main sources of downtime when upgrading using logical replication.

I think there are good chances that one can make mistakes following
all the above steps unless she is an expert.

My ultimate goal is to provide some tooling to do that in a much simpler way.
Maybe a new "promote to logical" action that would take care of steps 2 to 9.
Users would therefore only have to do this "promotion to logical", and then run
pg_upgrade and create a new physical replication cluster if they want.

Why don't we try to support the direct upgrade of logical replication
nodes? Have you tried to analyze what are the obstacles and whether we
can have solutions for those? For example, one of the challenges is to
support the upgrade of slots, can we copy (from the old cluster) and
recreate them in the new cluster by resetting LSNs? We can also reset
origins during the upgrade of subscribers and recommend to first
upgrade the subscriber node.

--
With Regards,
Amit Kapila.

#24Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#23)
Re: pg_upgrade and logical replication

On Thu, Mar 02, 2023 at 03:47:53PM +0530, Amit Kapila wrote:

On Wed, Mar 1, 2023 at 12:25 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

1) setup a normal physical replication cluster (pg_basebackup, restoring PITR,
whatever), let's call the primary node "A" and replica node "B"
2) ensure WAL level is "logical" on the primary node A
3) create a logical replication slot on every (connectable) database (or just
the one you're interested in if you don't want to preserve everything) on A
4) create a FOR ALL TABLES publication (again for every database or just the
one you're interested in)
5) wait for replication to be reasonably if not entirely up to date
6) promote the standby node B
7) retrieve the promotion LSN (from the XXXXXXXX.history file,
pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn()...)
8) call pg_replication_slot_advance() with that LSN for all previously created
logical replication slots on A

How are these slots used? Do subscriptions use these slots?

Yes, as this is the only way to make sure that you replicate everything since
the promotion, and only once. To be more precise, something like that:

CREATE SUBSCRIPTION db_xxx_subscription
CONNECTION 'dbname=db_xxx user=...'
PUBLICATION sub_for_db_xxx
WITH (create_slot = false,
slot_name = 'slot_for_db_xxx',
copy_data = false);

9) create a normal subscription on all wanted databases on the promoted node
10) wait for it to catch up if needed on B
11) stop the node B
12) run pg_upgrade on B, creating the new node C
13) start C, run the global ANALYZE and any sanity check needed (hopefully you
would have validated that your application is compatible with that new
version before this point)
14) re-enable the subscription on C. This is currently not possible without
losing data; the patch fixes that
15) wait for it to catch up if needed
16) create any missing relations and run ALTER SUBSCRIPTION ... REFRESH if
needed
17) trash B
18) create new nodes D, E... as physical replicas from C if needed, possibly
using a cheaper approach like pg_start_backup() / rsync / pg_stop_backup() if
needed
19) switchover to C and trash A (or convert it to another replica if you want)
20) trash the publications on C on all databases

As noted, step 14 is currently problematic, and it is also problematic in any
variation of that scenario that doesn't require you to entirely recreate the
node C from scratch using logical replication, which is what I want to avoid.

This isn't terribly complicated but requires you to be really careful if you don't
want to end up with an incorrect node C. This approach is also currently not
entirely ideal, but hopefully logical replication of sequences and DDL will
remove the main sources of downtime when upgrading using logical replication.

I think there are good chances that one can make mistakes following
all the above steps unless she is an expert.

Assuming we do fix pg_upgrade behavior with subscriptions, there isn't much
room for error compared to other scenarios:

- pg_upgrade has been there for ages and contains a lot of sanity checks.
People already use it and AFAIK it's not a major pain point, apart from the
cases where it can be slow
- ALTER SUBSCRIPTION ... REFRESH will complain if tables are missing locally
- similarly, the logical replica will complain if you're missing some other DDL
locally
- you only create replicas if you had some in the first place, so it's something
you should already know how to do. If not, you didn't have any before the
upgrade and you still won't have any after

My ultimate goal is to provide some tooling to do that in a much simpler way.
Maybe a new "promote to logical" action that would take care of steps 2 to 9.
Users would therefore only have to do this "promotion to logical", and then run
pg_upgrade and create a new physical replication cluster if they want.

Why don't we try to support the direct upgrade of logical replication
nodes? Have you tried to analyze what are the obstacles and whether we
can have solutions for those? For example, one of the challenges is to
support the upgrade of slots, can we copy (from the old cluster) and
recreate them in the new cluster by resetting LSNs? We can also reset
origins during the upgrade of subscribers and recommend to first
upgrade the subscriber node.

I'm not sure I get your question. This whole thread is about direct upgrade of
logical replication nodes, at least the subscribers, and what is currently
preventing it.

For the publisher nodes, that may be something nice to support (I'm assuming it
could be useful for more complex replication setups) but I'm not interested in
that at the moment as my goal is to reduce downtime for major upgrade of
physical replica, thus *not* doing pg_upgrade of the primary node, whether
physical or logical. I don't see why it couldn't be done later on, if/when
someone has a use case for it.

#25Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#24)
Re: pg_upgrade and logical replication

On Thu, Mar 2, 2023 at 4:21 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Thu, Mar 02, 2023 at 03:47:53PM +0530, Amit Kapila wrote:

Why don't we try to support the direct upgrade of logical replication
nodes? Have you tried to analyze what are the obstacles and whether we
can have solutions for those? For example, one of the challenges is to
support the upgrade of slots, can we copy (from the old cluster) and
recreate them in the new cluster by resetting LSNs? We can also reset
origins during the upgrade of subscribers and recommend to first
upgrade the subscriber node.

I'm not sure I get your question. This whole thread is about direct upgrade of
logical replication nodes, at least the subscribers, and what is currently
preventing it.

It is only about subscribers and nothing about publishers.

For the publisher nodes, that may be something nice to support (I'm assuming it
could be useful for more complex replication setups) but I'm not interested in
that at the moment as my goal is to reduce downtime for major upgrade of
physical replica, thus *not* doing pg_upgrade of the primary node, whether
physical or logical. I don't see why it couldn't be done later on, if/when
someone has a use case for it.

I thought there is value if we provide a way to upgrade both publisher
and subscriber. Now, you came up with a use case linking it to a
physical replica where allowing an upgrade of only subscriber nodes is
useful. It is possible that users find your steps easy to perform and
didn't find them error-prone, but it may be better to get some
confirmation of the same. I haven't yet analyzed all the steps in
detail but let's see what others think.

--
With Regards,
Amit Kapila.

#26Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#25)
Re: pg_upgrade and logical replication

On Sat, 4 Mar 2023, 14:13 Amit Kapila, <amit.kapila16@gmail.com> wrote:

For the publisher nodes, that may be something nice to support (I'm assuming it
could be useful for more complex replication setups) but I'm not interested in
that at the moment as my goal is to reduce downtime for major upgrade of
physical replica, thus *not* doing pg_upgrade of the primary node, whether
physical or logical. I don't see why it couldn't be done later on, if/when
someone has a use case for it.

I thought there is value if we provide a way to upgrade both publisher
and subscriber.

it's still unclear to me whether it's actually achievable on the publisher
side, as running pg_upgrade leaves a "hole" in the WAL stream and resets
the timeline, among other possible difficulties. Now I don't know much
about logical replication internals so I'm clearly not the best person to
answer those questions.

Now, you came up with a use case linking it to a
physical replica where allowing an upgrade of only subscriber nodes is
useful. It is possible that users find your steps easy to perform and
didn't find them error-prone, but it may be better to get some
confirmation of the same. I haven't yet analyzed all the steps in
detail but let's see what others think.

It's been quite some time since then and no one seemed to chime in or object.
IMO doing a major version upgrade with limited downtime (so something
faster than stopping postgres and running pg_upgrade) has always been
difficult and never prevented anyone from doing it, so I don't think that
it should be a blocker for what I'm suggesting here, especially since the
current behavior of pg_upgrade on a subscriber node is IMHO broken.

Is there something that can be done for pg16? I was thinking that having a
fix for the normal and easy case could be acceptable: only allowing
pg_upgrade to optionally, and not by default, preserve the subscription
relations IFF all subscriptions only have tables in ready state. Different
states should be transient, and it's easy to check as a user beforehand and
also easy to check during pg_upgrade, so it seems like an acceptable
limitations (which I personally see as a good sanity check, but YMMV). It
could be lifted in later releases if wanted anyway.

It's unclear to me whether this limited scope would also require
preserving the replication origins, but having looked at the code I don't
think it would be much of a problem as the local LSN doesn't have to be
preserved. In both cases I would prefer a single option (e.g.
--preserve-logical-subscription-state or something like that) to avoid too
many complications. Similarly, I still don't see any sensible use case for
allowing such an option in a normal pg_dump so I'd rather not expose that.

#27Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#26)
Re: pg_upgrade and logical replication

On Wed, Mar 8, 2023 at 12:26 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Sat, 4 Mar 2023, 14:13 Amit Kapila, <amit.kapila16@gmail.com> wrote:

For the publisher nodes, that may be something nice to support (I'm assuming it
could be useful for more complex replication setups) but I'm not interested in
that at the moment as my goal is to reduce downtime for major upgrade of
physical replica, thus *not* doing pg_upgrade of the primary node, whether
physical or logical. I don't see why it couldn't be done later on, if/when
someone has a use case for it.

I thought there is value if we provide a way to upgrade both publisher
and subscriber.

it's still unclear to me whether it's actually achievable on the publisher side, as running pg_upgrade leaves a "hole" in the WAL stream and resets the timeline, among other possible difficulties. Now I don't know much about logical replication internals so I'm clearly not the best person to answer those questions.

I think that is the part we need to analyze and see what are the
challenges there. One part of the challenge is that we need to
preserve slots that have some WAL locations like restart_lsn,
confirmed_flush and we need WAL from those locations for decoding. I
haven't analyzed this but isn't it possible that on a clean shutdown
we confirm that all the WAL has been sent and confirmed by the logical
subscriber, in which case I think truncating WAL in pg_upgrade
shouldn't be a problem?
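That check could presumably be approximated with something like this on the
publisher, once writes have stopped:

SELECT slot_name, confirmed_flush_lsn, pg_current_wal_insert_lsn()
FROM pg_replication_slots
WHERE slot_type = 'logical';

with every confirmed_flush_lsn expected to catch up to the current insert LSN
before the shutdown.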

Now, you came up with a use case linking it to a
physical replica where allowing an upgrade of only subscriber nodes is
useful. It is possible that users find your steps easy to perform and
didn't find them error-prone but it may be better to get some
authentication of the same. I haven't yet analyzed all the steps in
detail but let's see what others think.

It's been quite some time since then and no one seemed to chime in or object. IMO doing a major version upgrade with limited downtime (so something faster than stopping postgres and running pg_upgrade) has always been difficult and never prevented anyone from doing it, so I don't think that it should be a blocker for what I'm suggesting here, especially since the current behavior of pg_upgrade on a subscriber node is IMHO broken.

Is there something that can be done for pg16? I was thinking that having a fix for the normal and easy case could be acceptable: only allowing pg_upgrade to optionally, and not by default, preserve the subscription relations IFF all subscriptions only have tables in the ready state. The other states should be transient, and it's easy to check as a user beforehand and also easy to check during pg_upgrade, so it seems like an acceptable limitation (which I personally see as a good sanity check, but YMMV). It could be lifted in later releases if wanted anyway.

It's unclear to me whether this limited scope would also require preserving the replication origins, but having looked at the code I don't think it would be much of a problem as the local LSN doesn't have to be preserved.

I think we need to preserve replication origins as they help us to
determine the WAL location from where to start the streaming after the
upgrade. If we don't preserve those then from which location will the
subscriber start streaming? We don't want to replicate the WAL which
has already been sent.

--
With Regards,
Amit Kapila.

#28Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#27)
1 attachment(s)
Re: pg_upgrade and logical replication

Hi,

On Thu, Mar 09, 2023 at 12:05:36PM +0530, Amit Kapila wrote:

On Wed, Mar 8, 2023 at 12:26 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

Is there something that can be done for pg16? I was thinking that having a
fix for the normal and easy case could be acceptable: only allowing
pg_upgrade to optionally, and not by default, preserve the subscription
relations IFF all subscriptions only have tables in ready state. Different
states should be transient, and it's easy to check as a user beforehand and
also easy to check during pg_upgrade, so it seems like an acceptable
limitations (which I personally see as a good sanity check, but YMMV). It
could be lifted in later releases if wanted anyway.

It's unclear to me whether this limited scope would also require to
preserve the replication origins, but having looked at the code I don't
think it would be much of a problem as the local LSN doesn't have to be
preserved.

I think we need to preserve replication origins as they help us to
determine the WAL location from where to start the streaming after the
upgrade. If we don't preserve those then from which location will the
subscriber start streaming?

It would start from the slot's information on the publisher side, but I guess
there's no guarantee that this will be accurate in all cases.
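(The origin state that would need preserving is what's shown by:

SELECT local_id, external_id, remote_lsn, local_lsn
FROM pg_replication_origin_status;

with remote_lsn being the important part here, as the local LSN can't survive
an upgrade anyway.)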

We don't want to replicate the WAL which
has already been sent.

Yeah I agree. I added support to also preserve the subscription's replication
origin information, a new --preserve-subscription-state (better naming welcome)
documented option for pg_upgrade to optionally ask for this new mode, and a
similar (but undocumented) option for pg_dump that only works with
--binary-upgrade and added a check in pg_upgrade that all relations are in 'r'
(ready) state. Patch v2 attached.
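For a subscription with a single table, the extra command emitted by pg_dump
would look something like this (the subscription name, oid and LSN here are
just placeholders):

ALTER SUBSCRIPTION sub_for_db_xxx
    ADD TABLE (relid = 16384, state = 'r', lsn = '0/12345678');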

Attachments:

v2-0001-Optionally-preserve-the-full-subscription-s-state.patchtext/plain; charset=us-asciiDownload
From 0a77ac305243e0f58dbfce6bb7c8cf062b45d4f4 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Wed, 22 Feb 2023 09:19:32 +0800
Subject: [PATCH v2] Optionally preserve the full subscription's state during
 pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling it,
we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

Similarly, the subscription's replication origins are needed to ensure
that we don't replicate anything twice.

To fix this problem, this patch teaches pg_dump in binary upgrade mode to emit
additional commands to be able to restore the content of pg_subscription_rel,
and an additional LSN parameter in the subscription creation to restore the
underlying replication origin's remote LSN.  The LSN parameter is only accepted
in CREATE SUBSCRIPTION in binary upgrade mode.

The new ALTER SUBSCRIPTION subcommand, usable only during binary upgrade, has
the following syntax:

ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

The relation is identified by its oid, as it's preserved during pg_upgrade.
The lsn is optional, and defaults to NULL / InvalidXLogRecPtr.

This mode is optional and not enabled by default.  A new
--preserve-subscription-state option is added to pg_upgrade to use it.  For
now, pg_upgrade will check that all the subscription relations are in 'r'
(ready) state, and will error out if any subscription relation in any database
has a different state, logging the list of problematic databases with the
number of problematic relations in each.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml         |  13 +++
 src/backend/commands/subscriptioncmds.c |  67 +++++++++++++-
 src/backend/parser/gram.y               |  11 +++
 src/bin/pg_dump/pg_backup.h             |   2 +
 src/bin/pg_dump/pg_dump.c               | 114 +++++++++++++++++++++++-
 src/bin/pg_dump/pg_dump.h               |  13 +++
 src/bin/pg_upgrade/check.c              |  54 +++++++++++
 src/bin/pg_upgrade/dump.c               |   3 +-
 src/bin/pg_upgrade/option.c             |   7 ++
 src/bin/pg_upgrade/pg_upgrade.h         |   1 +
 src/include/nodes/parsenodes.h          |   3 +-
 11 files changed, 283 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 7816b4c685..aef3b8a8b8 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -240,6 +240,19 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--preserve-subscription-state</option></term>
+      <listitem>
+       <para>
+        Fully preserve the logical subscription state, if any.  That includes
+        the underlying replication origins with their remote LSNs and the list
+        of relations in each subscription.  If any subscription on the old
+        cluster has any relation in a state other than <literal>r</literal>
+        (ready), the <application>pg_upgrade</application> run will error out.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-?</option></term>
       <term><option>--help</option></term>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..75278991e4 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,8 @@
 #define SUBOPT_DISABLE_ON_ERR		0x00000400
 #define SUBOPT_LSN					0x00000800
 #define SUBOPT_ORIGIN				0x00001000
+#define SUBOPT_RELID				0x00002000
+#define SUBOPT_STATE				0x00004000
 
 /* check if the 'val' has 'bits' set */
 #define IsSet(val, bits)  (((val) & (bits)) == (bits))
@@ -90,6 +92,8 @@ typedef struct SubOpts
 	bool		disableonerr;
 	char	   *origin;
 	XLogRecPtr	lsn;
+	Oid			relid;
+	char		state;
 } SubOpts;
 
 static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -324,6 +328,38 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
 			opts->specified_opts |= SUBOPT_LSN;
 			opts->lsn = lsn;
 		}
+		else if (IsSet(supported_opts, SUBOPT_RELID) &&
+				 strcmp(defel->defname, "relid") == 0)
+		{
+			Oid			relid = defGetObjectId(defel);
+
+			if (IsSet(opts->specified_opts, SUBOPT_RELID))
+				errorConflictingDefElem(defel, pstate);
+
+			if (!OidIsValid(relid))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("invalid relation identifier used")));
+
+			opts->specified_opts |= SUBOPT_RELID;
+			opts->relid = relid;
+		}
+		else if (IsSet(supported_opts, SUBOPT_STATE) &&
+				 strcmp(defel->defname, "state") == 0)
+		{
+			char	   *state_str = defGetString(defel);
+
+			if (IsSet(opts->specified_opts, SUBOPT_STATE))
+				errorConflictingDefElem(defel, pstate);
+
+			if (strlen(state_str) != 1)
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("invalid relation state used")));
+
+			opts->specified_opts |= SUBOPT_STATE;
+			opts->state = state_str[0];
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -550,6 +586,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	List	   *publications;
 	bits32		supported_opts;
 	SubOpts		opts = {0};
+	RepOriginId	originid;
 
 	/*
 	 * Parse and check options.
@@ -561,6 +598,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 					  SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
 					  SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
 					  SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+	if (IsBinaryUpgrade)
+		supported_opts |= SUBOPT_LSN;
 	parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
 
 	/*
@@ -659,7 +698,12 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	recordDependencyOnOwner(SubscriptionRelationId, subid, owner);
 
 	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
-	replorigin_create(originname);
+	originid = replorigin_create(originname);
+
+	if (IsBinaryUpgrade && IsSet(opts.specified_opts, SUBOPT_LSN))
+		replorigin_advance(originid, opts.lsn, InvalidXLogRecPtr,
+							false /* backward */ ,
+							false /* WAL log */ );
 
 	/*
 	 * Connect to remote side to execute requested commands and fetch table
@@ -1341,6 +1385,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 				break;
 			}
 
+		case ALTER_SUBSCRIPTION_ADD_TABLE:
+			{
+				if (!IsBinaryUpgrade)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR)),
+							errmsg("ALTER SUBSCRIPTION ... ADD TABLE is not supported"));
+
+				supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
+				parse_subscription_options(pstate, stmt->options,
+										   supported_opts, &opts);
+
+				/* relid and state should always be provided. */
+				Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
+				Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
+
+				AddSubscriptionRelState(subid, opts.relid, opts.state,
+										opts.lsn);
+
+				break;
+			}
+
 		default:
 			elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
 				 stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a0138382a1..0a3448c487 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10670,6 +10670,17 @@ AlterSubscriptionStmt:
 					n->options = $5;
 					$$ = (Node *) n;
 				}
+			/* for binary upgrade only */
+			| ALTER SUBSCRIPTION name ADD_P TABLE definition
+				{
+					AlterSubscriptionStmt *n =
+						makeNode(AlterSubscriptionStmt);
+
+					n->kind = ALTER_SUBSCRIPTION_ADD_TABLE;
+					n->subname = $3;
+					n->options = $6;
+					$$ = (Node *) n;
+				}
 		;
 
 /*****************************************************************************
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index aba780ef4b..8a72a39d60 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -200,6 +200,8 @@ typedef struct _dumpOptions
 
 	int			sequence_data;	/* dump sequence data even in schema-only mode */
 	int			do_nothing;
+
+	bool		preserve_subscriptions;
 } DumpOptions;
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4217908f84..c6499a3d24 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -421,6 +421,7 @@ main(int argc, char **argv)
 		{"on-conflict-do-nothing", no_argument, &dopt.do_nothing, 1},
 		{"rows-per-insert", required_argument, NULL, 10},
 		{"include-foreign-data", required_argument, NULL, 11},
+		{"preserve-subscription-state", no_argument, NULL, 12},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -631,6 +632,10 @@ main(int argc, char **argv)
 										  optarg);
 				break;
 
+			case 12:			/* include full subscription state */
+				dopt.preserve_subscriptions = true;
+				break;
+
 			default:
 				/* getopt_long already emitted a complaint */
 				pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -688,6 +693,10 @@ main(int argc, char **argv)
 	if (dopt.do_nothing && dopt.dump_inserts == 0)
 		pg_fatal("option --on-conflict-do-nothing requires option --inserts, --rows-per-insert, or --column-inserts");
 
+	/* --preserve-subscription-state requires --binary-upgrade */
+	if (dopt.preserve_subscriptions && !dopt.binary_upgrade)
+		pg_fatal("option --preserve-subscription-state requires option --binary-upgrade");
+
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
 
@@ -4485,6 +4494,69 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionRels
+ *	  get information about the given subscription's relations
+ */
+static SubRelInfo *
+getSubscriptionRels(Archive *fout, Oid subid, int *nrels)
+{
+	SubRelInfo *rels;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i,
+				ntups;
+
+	if (!fout->dopt->binary_upgrade || !fout->dopt->preserve_subscriptions)
+	{
+		*nrels = 0;
+
+		return NULL;
+	}
+
+	query = createPQExpBuffer();
+
+	appendPQExpBuffer(query, "SELECT srrelid, srsubstate, srsublsn "
+								" FROM pg_subscription_rel"
+								" WHERE srsubid = %u", subid);
+
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	*nrels = ntups;
+
+	if (ntups == 0)
+	{
+		rels = NULL;
+		goto cleanup;
+	}
+
+	/*
+	 * Get subscription relation fields.
+	 */
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	rels = pg_malloc(ntups * sizeof(SubRelInfo));
+
+	for (i = 0; i < ntups; i++)
+	{
+		rels[i].srrelid = atooid(PQgetvalue(res, i, i_srrelid));
+		rels[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		rels[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+
+	return rels;
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4509,6 +4581,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_subbinary;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4561,12 +4634,16 @@ getSubscriptions(Archive *fout)
 						  LOGICALREP_TWOPHASE_STATE_DISABLED);
 
 	if (fout->remoteVersion >= 160000)
-		appendPQExpBufferStr(query, " s.suborigin\n");
+		appendPQExpBufferStr(query, " s.suborigin,\n");
 	else
-		appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+		appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+
+	appendPQExpBufferStr(query, "o.remote_lsn\n");
 
 	appendPQExpBufferStr(query,
 						 "FROM pg_subscription s\n"
+						 "LEFT JOIN pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4591,6 +4668,7 @@ getSubscriptions(Archive *fout)
 	i_subtwophasestate = PQfnumber(res, "subtwophasestate");
 	i_subdisableonerr = PQfnumber(res, "subdisableonerr");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "remote_lsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4621,6 +4699,15 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subdisableonerr =
 			pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+
+		subinfo[i].subrels = getSubscriptionRels(fout,
+												 subinfo[i].dobj.catId.oid,
+												 &subinfo[i].nrels);
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4702,9 +4789,31 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 	if (strcmp(subinfo->subsynccommit, "off") != 0)
 		appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
 
+	if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
+	}
+
 	appendPQExpBufferStr(query, ");\n");
 
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		for (i = 0; i < subinfo->nrels; i++)
+		{
+			appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
+									 "(relid = %u, state = '%c'",
+									 qsubname,
+									 subinfo->subrels[i].srrelid,
+									 subinfo->subrels[i].srsubstate);
+
+			if (subinfo->subrels[i].srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", LSN = '%s'",
+								  subinfo->subrels[i].srsublsn);
+
+			appendPQExpBufferStr(query, ");");
+		}
+
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
 								  .owner = subinfo->rolname,
@@ -4712,6 +4821,7 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 								  .section = SECTION_POST_DATA,
 								  .createStmt = query->data,
 								  .dropStmt = delq->data));
+	}
 
 	if (subinfo->dobj.dump & DUMP_COMPONENT_COMMENT)
 		dumpComment(fout, "SUBSCRIPTION", qsubname,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index cdca0b993d..43ab4acf35 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -645,6 +645,16 @@ typedef struct _PublicationSchemaInfo
 	NamespaceInfo *pubschema;
 } PublicationSchemaInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	Oid		srrelid;
+	char	srsubstate;
+	char   *srsublsn;
+} SubRelInfo;
+
 /*
  * The SubscriptionInfo struct is used to represent subscription.
  */
@@ -661,6 +671,9 @@ typedef struct _SubscriptionInfo
 	char	   *suborigin;
 	char	   *subsynccommit;
 	char	   *subpublications;
+	char	   *suboriginremotelsn;
+	int			nrels;
+	SubRelInfo *subrels;
 } SubscriptionInfo;
 
 /*
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 7cf68dc9af..74806bf2cc 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -23,6 +23,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_rels(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -107,6 +108,8 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_composite_data_type_usage(&old_cluster);
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
+	if (user_opts.preserve_subscriptions)
+		check_for_subscription_rels(&old_cluster);
 
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the on-disk
@@ -907,6 +910,57 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_rels()
+ *
+ * Verify that no table in a subscription is in a state different from ready.
+ */
+static void
+check_for_subscription_rels(ClusterInfo *cluster)
+{
+	int			dbnum;
+	bool		is_error = false;
+
+	Assert(user_opts.preserve_subscriptions);
+
+	/* No subscription before pg10. */
+	if (GET_MAJOR_VERSION(cluster->major_version) < 1000)
+		return;
+
+	prep_status("Checking for non-ready subscription relations");
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		int			nb;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		res = executeQueryOrDie(conn,
+								"SELECT count(0) "
+								"FROM pg_catalog.pg_subscription_rel "
+								"WHERE srsubstate != 'r'");
+
+		if (PQntuples(res) != 1)
+			pg_fatal("could not determine the number of non-ready subscription relations");
+
+		nb = atoi(PQgetvalue(res, 0, 0));
+		if (nb != 0)
+		{
+			is_error = true;
+			pg_log(PG_WARNING,
+				   "\nWARNING: database \"%s\" has %d subscription "
+				   "relation(s) in non-ready state", active_db->db_name, nb);
+		}
+	}
+
+	if (is_error)
+		pg_fatal("--preserve-subscription-state is incompatible with "
+				"subscription relations in non-ready state");
+
+	check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
diff --git a/src/bin/pg_upgrade/dump.c b/src/bin/pg_upgrade/dump.c
index 6c8c82dca8..9284576af7 100644
--- a/src/bin/pg_upgrade/dump.c
+++ b/src/bin/pg_upgrade/dump.c
@@ -53,9 +53,10 @@ generate_old_dump(void)
 
 		parallel_exec_prog(log_file_name, NULL,
 						   "\"%s/pg_dump\" %s --schema-only --quote-all-identifiers "
-						   "--binary-upgrade --format=custom %s --file=\"%s/%s\" %s",
+						   "--binary-upgrade --format=custom %s %s --file=\"%s/%s\" %s",
 						   new_cluster.bindir, cluster_conn_opts(&old_cluster),
 						   log_opts.verbose ? "--verbose" : "",
+						   user_opts.preserve_subscriptions ? "--preserve-subscription-state" : "",
 						   log_opts.dumpdir,
 						   sql_file_name, escaped_connstr.data);
 
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index 8869b6b60d..b033aa26ba 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -57,6 +57,7 @@ parseCommandLine(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"clone", no_argument, NULL, 1},
 		{"copy", no_argument, NULL, 2},
+		{"preserve-subscription-state", no_argument, NULL, 3},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -66,6 +67,7 @@ parseCommandLine(int argc, char *argv[])
 
 	user_opts.do_sync = true;
 	user_opts.transfer_mode = TRANSFER_MODE_COPY;
+	user_opts.preserve_subscriptions = false;
 
 	os_info.progname = get_progname(argv[0]);
 
@@ -199,6 +201,10 @@ parseCommandLine(int argc, char *argv[])
 				user_opts.transfer_mode = TRANSFER_MODE_COPY;
 				break;
 
+			case 3:
+				user_opts.preserve_subscriptions = true;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
 						os_info.progname);
@@ -289,6 +295,7 @@ usage(void)
 	printf(_("  -V, --version                 display version information, then exit\n"));
 	printf(_("  --clone                       clone instead of copying files to new cluster\n"));
 	printf(_("  --copy                        copy files to new cluster (default)\n"));
+	printf(_("  --preserve-subscription-state preserve the subscription state fully\n"));
 	printf(_("  -?, --help                    show this help, then exit\n"));
 	printf(_("\n"
 			 "Before running pg_upgrade you must:\n"
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 5f2a116f23..e0d44e41e3 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -296,6 +296,7 @@ typedef struct
 	transferMode transfer_mode; /* copy files or link them? */
 	int			jobs;			/* number of processes/threads to use */
 	char	   *socketdir;		/* directory to use for Unix sockets */
+	bool		preserve_subscriptions; /* fully transfer subscription state */
 } UserOpts;
 
 typedef struct
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 371aa0ffc5..d441fccd5e 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3920,7 +3920,8 @@ typedef enum AlterSubscriptionType
 	ALTER_SUBSCRIPTION_DROP_PUBLICATION,
 	ALTER_SUBSCRIPTION_REFRESH,
 	ALTER_SUBSCRIPTION_ENABLED,
-	ALTER_SUBSCRIPTION_SKIP
+	ALTER_SUBSCRIPTION_SKIP,
+	ALTER_SUBSCRIPTION_ADD_TABLE
 } AlterSubscriptionType;
 
 typedef struct AlterSubscriptionStmt
-- 
2.37.0

#29Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Julien Rouhaud (#20)
Re: pg_upgrade and logical replication

On Wed, Mar 1, 2023 at 3:55 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Wed, Mar 01, 2023 at 11:51:49AM +0530, Amit Kapila wrote:

On Tue, Feb 28, 2023 at 10:18 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

Well, as I mentioned I'm *not* interested in a logical-replication-only
scenario. Logical replication is nice but it will always be less efficient
than physical replication, and some workloads also don't really play well with
it. So while it can be a huge asset in some cases I'm for now looking at
leveraging logical replication for the purpose of major upgrade only for a
physical replication cluster, so the publications and subscriptions are only
temporary and trashed after use.

That being said I was only saying that if I had to do a major upgrade of a
logical replication cluster this is probably how I would try to do it, to
minimize downtime, even if there are probably *a lot* of difficulties to
overcome.

Okay, but it would be better if you list out your detailed steps. It
would be useful to support the new mechanism in this area if others
also find your steps to upgrade useful.

Sure. Here are the overly detailed steps:

1) setup a normal physical replication cluster (pg_basebackup, restoring PITR,
whatever), let's call the primary node "A" and replica node "B"
2) ensure WAL level is "logical" on the primary node A
3) create a logical replication slot on every (connectable) database (or just
the one you're interested in if you don't want to preserve everything) on A
4) create a FOR ALL TABLES publication (again for every database or just the
one you're interested in)
5) wait for replication to be reasonably, if not entirely, up to date
6) promote the standby node B
7) retrieve the promotion LSN (from the XXXXXXXX.history file,
pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn()...)
8) call pg_replication_slot_advance() with that LSN for all previously created
logical replication slots on A
9) create a normal subscription on all wanted databases on the promoted node
10) wait for it to catch up if needed on B
11) stop the node B
12) run pg_upgrade on B, creating the new node C
13) start C, run the global ANALYZE and any sanity check needed (hopefully you
would have validated that your application is compatible with that new
version before this point)
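For illustration, steps 3, 4, 8 and 9 above could look roughly like the
following (the slot, publication, subscription and connection names as well as
the LSN value are placeholders; the LSN is the promotion LSN retrieved in
step 7):

```sql
-- step 3, on A, in each database to preserve
SELECT pg_create_logical_replication_slot('mig_slot', 'pgoutput');

-- step 4, on A
CREATE PUBLICATION mig_pub FOR ALL TABLES;

-- step 8, on A, after promoting B: skip everything already replicated
SELECT pg_replication_slot_advance('mig_slot', '0/3000148'::pg_lsn);

-- step 9, on B: reuse the existing slot and don't copy any data
CREATE SUBSCRIPTION mig_sub
    CONNECTION 'host=nodeA dbname=mydb'
    PUBLICATION mig_pub
    WITH (create_slot = false, slot_name = 'mig_slot', copy_data = false);
```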

I might be missing something but is there any reason why you created a
subscription before pg_upgrade?

Steps like doing pg_upgrade, then creating missing tables, and then
creating a subscription (with copy_data = false) could be an
alternative way to support upgrading the server from the physical
standby?
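Such an alternative might look roughly like this on the upgraded node
(subscription, publication and connection names are placeholders), assuming any
missing tables were created first:

```sql
-- after pg_upgrade and after creating any tables missing on the new node:
-- attach to the existing publication without triggering an initial sync
CREATE SUBSCRIPTION mig_sub
    CONNECTION 'host=publisher dbname=mydb'
    PUBLICATION mig_pub
    WITH (copy_data = false);
```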

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#30Julien Rouhaud
rjuju123@gmail.com
In reply to: Masahiko Sawada (#29)
Re: pg_upgrade and logical replication

Hi,

On Thu, Mar 23, 2023 at 04:27:28PM +0900, Masahiko Sawada wrote:

I might be missing something but is there any reason why you created a
subscription before pg_upgrade?

Steps like doing pg_upgrade, then creating missing tables, and then
creating a subscription (with copy_data = false) could be an
alternative way to support upgrading the server from the physical
standby?

As I already answered to Nikolay, and explained in my very first email, yes
it's possible to create the subscriptions after running pg_upgrade. I
personally prefer to do it first to make sure that the logical replication is
actually functional, so I can still easily do a pg_rewind or something to fix
things without having to trash the newly built (and promoted) replica.

But that exact scenario is a corner case, as in any other scenario pg_upgrade
leaves the subscription in an unrecoverable state, where you have to truncate
all the underlying tables first and start from scratch doing an initial sync.
This kind of defeats the purpose of pg_upgrade.

#31Julien Rouhaud
rjuju123@gmail.com
In reply to: Julien Rouhaud (#28)
1 attachment(s)
Re: pg_upgrade and logical replication

Hi,

On Thu, Mar 09, 2023 at 04:34:56PM +0800, Julien Rouhaud wrote:

Yeah I agree. I added support to also preserve the subscription's replication
origin information, a new documented --preserve-subscription-state option
(better naming welcome) for pg_upgrade to optionally ask for this new mode, a
similar (but undocumented) option for pg_dump that only works with
--binary-upgrade, and a check in pg_upgrade that all relations are in 'r'
(ready) state. Patch v2 attached.

I'm attaching a v3 to fix a recent conflict with pg_dump due to a563c24c9574b7
(Allow pg_dump to include/exclude child tables automatically). While at it I
also tried to improve the documentation, explaining how that option can be
useful and what the drawback of not using it is (linking to the pg_dump note
on the same topic) if you plan to reactivate subscription(s) after an upgrade.

Attachments:

v3-0001-Optionally-preserve-the-full-subscription-s-state.patchtext/plain; charset=us-asciiDownload
From 3a17a292805451c7b1733bd1e331bee91b2ce1c5 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Wed, 22 Feb 2023 09:19:32 +0800
Subject: [PATCH v3] Optionally preserve the full subscription's state during
 pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

Similarly, the subscriptions' replication origins are needed to ensure
that we don't replicate anything twice.

To fix this problem, this patch teaches pg_dump in binary upgrade mode to emit
additional commands to restore the content of pg_subscription_rel, and an
additional LSN parameter in the subscription creation to restore the
underlying replication origin's remote LSN.  The LSN parameter is only accepted
in CREATE SUBSCRIPTION in binary upgrade mode.

The new ALTER SUBSCRIPTION subcommand, usable only during binary upgrade, has
the following syntax:

ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

The relation is identified by its oid, as it's preserved during pg_upgrade.
The lsn is optional, and defaults to NULL / InvalidXLogRecPtr.

This mode is optional and not enabled by default.  A new
--preserve-subscription-state option is added to pg_upgrade to use it.  For
now, pg_upgrade will check that all the subscription relations are in 'r'
(ready) state, and will error out if any subscription relation in any database
has a different state, logging the list of problematic databases with the
number of problematic relations in each.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml         |  18 ++++
 src/backend/commands/subscriptioncmds.c |  67 +++++++++++++-
 src/backend/parser/gram.y               |  11 +++
 src/bin/pg_dump/pg_backup.h             |   2 +
 src/bin/pg_dump/pg_dump.c               | 114 +++++++++++++++++++++++-
 src/bin/pg_dump/pg_dump.h               |  13 +++
 src/bin/pg_upgrade/check.c              |  54 +++++++++++
 src/bin/pg_upgrade/dump.c               |   3 +-
 src/bin/pg_upgrade/option.c             |   7 ++
 src/bin/pg_upgrade/pg_upgrade.h         |   1 +
 src/include/nodes/parsenodes.h          |   3 +-
 11 files changed, 288 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 7816b4c685..0b3a8fd57b 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -240,6 +240,24 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--preserve-subscription-state</option></term>
+      <listitem>
+       <para>
+        Fully preserve the logical subscription state, if any.  That includes
+        the underlying replication origins with their remote LSNs and the
+        list of relations in each subscription so that replication can simply
+        be resumed if the subscriptions are reactivated.
+        If that option isn't used, it is up to the user to reactivate the
+        subscriptions in a suitable way; see the subscription part in <xref
+        linkend="pg-dump-notes"/> for more information.
+        If this option is used and any of the subscriptions on the old
+        cluster has a relation in a state different from <literal>r</literal>
+        (ready), the <application>pg_upgrade</application> run will error out.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-?</option></term>
       <term><option>--help</option></term>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 8a26ddab1c..9e9d011c06 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,8 @@
 #define SUBOPT_DISABLE_ON_ERR		0x00000400
 #define SUBOPT_LSN					0x00000800
 #define SUBOPT_ORIGIN				0x00001000
+#define SUBOPT_RELID				0x00002000
+#define SUBOPT_STATE				0x00004000
 
 /* check if the 'val' has 'bits' set */
 #define IsSet(val, bits)  (((val) & (bits)) == (bits))
@@ -90,6 +92,8 @@ typedef struct SubOpts
 	bool		disableonerr;
 	char	   *origin;
 	XLogRecPtr	lsn;
+	Oid			relid;
+	char		state;
 } SubOpts;
 
 static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -324,6 +328,38 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
 			opts->specified_opts |= SUBOPT_LSN;
 			opts->lsn = lsn;
 		}
+		else if (IsSet(supported_opts, SUBOPT_RELID) &&
+				 strcmp(defel->defname, "relid") == 0)
+		{
+			Oid			relid = defGetObjectId(defel);
+
+			if (IsSet(opts->specified_opts, SUBOPT_RELID))
+				errorConflictingDefElem(defel, pstate);
+
+			if (!OidIsValid(relid))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("invalid relation identifier used")));
+
+			opts->specified_opts |= SUBOPT_RELID;
+			opts->relid = relid;
+		}
+		else if (IsSet(supported_opts, SUBOPT_STATE) &&
+				 strcmp(defel->defname, "state") == 0)
+		{
+			char	   *state_str = defGetString(defel);
+
+			if (IsSet(opts->specified_opts, SUBOPT_STATE))
+				errorConflictingDefElem(defel, pstate);
+
+			if (strlen(state_str) != 1)
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("invalid relation state used")));
+
+			opts->specified_opts |= SUBOPT_STATE;
+			opts->state = defGetString(defel)[0];
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -550,6 +586,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	List	   *publications;
 	bits32		supported_opts;
 	SubOpts		opts = {0};
+	RepOriginId	originid;
 
 	/*
 	 * Parse and check options.
@@ -561,6 +598,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 					  SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
 					  SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
 					  SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+	if (IsBinaryUpgrade)
+		supported_opts |= SUBOPT_LSN;
 	parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
 
 	/*
@@ -659,7 +698,12 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	recordDependencyOnOwner(SubscriptionRelationId, subid, owner);
 
 	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
-	replorigin_create(originname);
+	originid = replorigin_create(originname);
+
+	if (IsBinaryUpgrade && IsSet(opts.specified_opts, SUBOPT_LSN))
+		replorigin_advance(originid, opts.lsn, InvalidXLogRecPtr,
+							false /* backward */ ,
+							false /* WAL log */ );
 
 	/*
 	 * Connect to remote side to execute requested commands and fetch table
@@ -1341,6 +1385,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 				break;
 			}
 
+		case ALTER_SUBSCRIPTION_ADD_TABLE:
+			{
+				if (!IsBinaryUpgrade)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR)),
+							errmsg("ALTER SUBSCRIPTION ... ADD TABLE is not supported"));
+
+				supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
+				parse_subscription_options(pstate, stmt->options,
+										   supported_opts, &opts);
+
+				/* relid and state should always be provided. */
+				Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
+				Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
+
+				AddSubscriptionRelState(subid, opts.relid, opts.state,
+										opts.lsn);
+
+				break;
+			}
+
 		default:
 			elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
 				 stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index efe88ccf9d..43e8039a68 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10670,6 +10670,17 @@ AlterSubscriptionStmt:
 					n->options = $5;
 					$$ = (Node *) n;
 				}
+			/* for binary upgrade only */
+			| ALTER SUBSCRIPTION name ADD_P TABLE definition
+				{
+					AlterSubscriptionStmt *n =
+						makeNode(AlterSubscriptionStmt);
+
+					n->kind = ALTER_SUBSCRIPTION_ADD_TABLE;
+					n->subname = $3;
+					n->options = $6;
+					$$ = (Node *) n;
+				}
 		;
 
 /*****************************************************************************
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index aba780ef4b..8a72a39d60 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -200,6 +200,8 @@ typedef struct _dumpOptions
 
 	int			sequence_data;	/* dump sequence data even in schema-only mode */
 	int			do_nothing;
+
+	bool		preserve_subscriptions;
 } DumpOptions;
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index d62780a088..d949f4b72d 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -430,6 +430,7 @@ main(int argc, char **argv)
 		{"table-and-children", required_argument, NULL, 12},
 		{"exclude-table-and-children", required_argument, NULL, 13},
 		{"exclude-table-data-and-children", required_argument, NULL, 14},
+		{"preserve-subscription-state", no_argument, NULL, 15},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -656,6 +657,10 @@ main(int argc, char **argv)
 										  optarg);
 				break;
 
+			case 15:			/* include full subscription state */
+				dopt.preserve_subscriptions = true;
+				break;
+
 			default:
 				/* getopt_long already emitted a complaint */
 				pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -713,6 +718,10 @@ main(int argc, char **argv)
 	if (dopt.do_nothing && dopt.dump_inserts == 0)
 		pg_fatal("option --on-conflict-do-nothing requires option --inserts, --rows-per-insert, or --column-inserts");
 
+	/* --preserve-subscription-state requires --binary-upgrade */
+	if (dopt.preserve_subscriptions && !dopt.binary_upgrade)
+		pg_fatal("option --preserve-subscription-state requires option --binary-upgrade");
+
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
 
@@ -4584,6 +4593,69 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionRels
+ *	  get information about the given subscription's relations
+ */
+static SubRelInfo *
+getSubscriptionRels(Archive *fout, Oid subid, int *nrels)
+{
+	SubRelInfo *rels;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i,
+				ntups;
+
+	if (!fout->dopt->binary_upgrade || !fout->dopt->preserve_subscriptions)
+	{
+		*nrels = 0;
+
+		return NULL;
+	}
+
+	query = createPQExpBuffer();
+
+	appendPQExpBuffer(query, "SELECT srrelid, srsubstate, srsublsn "
+								" FROM pg_subscription_rel"
+								" WHERE srsubid = %u", subid);
+
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	*nrels = ntups;
+
+	if (ntups == 0)
+	{
+		rels = NULL;
+		goto cleanup;
+	}
+
+	/*
+	 * Get subscription relation fields.
+	 */
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	rels = pg_malloc(ntups * sizeof(SubRelInfo));
+
+	for (i = 0; i < ntups; i++)
+	{
+		rels[i].srrelid = atooid(PQgetvalue(res, i, i_srrelid));
+		rels[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		rels[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+
+	return rels;
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4608,6 +4680,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_subbinary;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4660,12 +4733,16 @@ getSubscriptions(Archive *fout)
 						  LOGICALREP_TWOPHASE_STATE_DISABLED);
 
 	if (fout->remoteVersion >= 160000)
-		appendPQExpBufferStr(query, " s.suborigin\n");
+		appendPQExpBufferStr(query, " s.suborigin,\n");
 	else
-		appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+		appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+
+	appendPQExpBufferStr(query, "o.remote_lsn\n");
 
 	appendPQExpBufferStr(query,
 						 "FROM pg_subscription s\n"
+						 "LEFT JOIN pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4690,6 +4767,7 @@ getSubscriptions(Archive *fout)
 	i_subtwophasestate = PQfnumber(res, "subtwophasestate");
 	i_subdisableonerr = PQfnumber(res, "subdisableonerr");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "remote_lsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4720,6 +4798,15 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subdisableonerr =
 			pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+
+		subinfo[i].subrels = getSubscriptionRels(fout,
+												 subinfo[i].dobj.catId.oid,
+												 &subinfo[i].nrels);
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4801,9 +4888,31 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 	if (strcmp(subinfo->subsynccommit, "off") != 0)
 		appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
 
+	if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
+	}
+
 	appendPQExpBufferStr(query, ");\n");
 
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		for (i = 0; i < subinfo->nrels; i++)
+		{
+			appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
+									 "(relid = %u, state = '%c'",
+									 qsubname,
+									 subinfo->subrels[i].srrelid,
+									 subinfo->subrels[i].srsubstate);
+
+			if (subinfo->subrels[i].srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", lsn = '%s'",
+								  subinfo->subrels[i].srsublsn);
+
+			appendPQExpBufferStr(query, ");");
+		}
+
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
 								  .owner = subinfo->rolname,
@@ -4811,6 +4920,7 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 								  .section = SECTION_POST_DATA,
 								  .createStmt = query->data,
 								  .dropStmt = delq->data));
+	}
 
 	if (subinfo->dobj.dump & DUMP_COMPONENT_COMMENT)
 		dumpComment(fout, "SUBSCRIPTION", qsubname,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 283cd1a602..fa5dd41541 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -647,6 +647,16 @@ typedef struct _PublicationSchemaInfo
 	NamespaceInfo *pubschema;
 } PublicationSchemaInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	Oid		srrelid;
+	char	srsubstate;
+	char   *srsublsn;
+} SubRelInfo;
+
 /*
  * The SubscriptionInfo struct is used to represent subscription.
  */
@@ -663,6 +673,9 @@ typedef struct _SubscriptionInfo
 	char	   *suborigin;
 	char	   *subsynccommit;
 	char	   *subpublications;
+	char	   *suboriginremotelsn;
+	int			nrels;
+	SubRelInfo *subrels;
 } SubscriptionInfo;
 
 /*
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fea159689e..7961cd8110 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -20,6 +20,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_rels(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -103,6 +104,8 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_composite_data_type_usage(&old_cluster);
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
+	if (user_opts.preserve_subscriptions)
+		check_for_subscription_rels(&old_cluster);
 
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the on-disk
@@ -785,6 +788,57 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_rels()
+ *
+ * Verify that no table in a subscription is in a state other than ready.
+ */
+static void
+check_for_subscription_rels(ClusterInfo *cluster)
+{
+	int			dbnum;
+	bool		is_error = false;
+
+	Assert(user_opts.preserve_subscriptions);
+
+	/* No subscription before pg10. */
+	if (GET_MAJOR_VERSION(cluster->major_version) < 1000)
+		return;
+
+	prep_status("Checking for non-ready subscription relations");
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		int			nb;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		res = executeQueryOrDie(conn,
+								"SELECT count(0) "
+								"FROM pg_catalog.pg_subscription_rel "
+								"WHERE srsubstate != 'r'");
+
+		if (PQntuples(res) != 1)
+			pg_fatal("could not determine the number of non-ready subscription relations");
+
+		nb = atoi(PQgetvalue(res, 0, 0));
+		if (nb != 0)
+		{
+			is_error = true;
+			pg_log(PG_WARNING,
+				   "\nWARNING: database \"%s\" has %d subscription "
+				   "relation(s) in a non-ready state", active_db->db_name, nb);
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (is_error)
+		pg_fatal("--preserve-subscription-state is incompatible with "
+				 "subscription relations in a non-ready state");
+
+	check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
diff --git a/src/bin/pg_upgrade/dump.c b/src/bin/pg_upgrade/dump.c
index 6c8c82dca8..9284576af7 100644
--- a/src/bin/pg_upgrade/dump.c
+++ b/src/bin/pg_upgrade/dump.c
@@ -53,9 +53,10 @@ generate_old_dump(void)
 
 		parallel_exec_prog(log_file_name, NULL,
 						   "\"%s/pg_dump\" %s --schema-only --quote-all-identifiers "
-						   "--binary-upgrade --format=custom %s --file=\"%s/%s\" %s",
+						   "--binary-upgrade --format=custom %s %s --file=\"%s/%s\" %s",
 						   new_cluster.bindir, cluster_conn_opts(&old_cluster),
 						   log_opts.verbose ? "--verbose" : "",
+						   user_opts.preserve_subscriptions ? "--preserve-subscription-state" : "",
 						   log_opts.dumpdir,
 						   sql_file_name, escaped_connstr.data);
 
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index 8869b6b60d..b033aa26ba 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -57,6 +57,7 @@ parseCommandLine(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"clone", no_argument, NULL, 1},
 		{"copy", no_argument, NULL, 2},
+		{"preserve-subscription-state", no_argument, NULL, 3},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -66,6 +67,7 @@ parseCommandLine(int argc, char *argv[])
 
 	user_opts.do_sync = true;
 	user_opts.transfer_mode = TRANSFER_MODE_COPY;
+	user_opts.preserve_subscriptions = false;
 
 	os_info.progname = get_progname(argv[0]);
 
@@ -199,6 +201,10 @@ parseCommandLine(int argc, char *argv[])
 				user_opts.transfer_mode = TRANSFER_MODE_COPY;
 				break;
 
+			case 3:
+				user_opts.preserve_subscriptions = true;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
 						os_info.progname);
@@ -289,6 +295,7 @@ usage(void)
 	printf(_("  -V, --version                 display version information, then exit\n"));
 	printf(_("  --clone                       clone instead of copying files to new cluster\n"));
 	printf(_("  --copy                        copy files to new cluster (default)\n"));
+	printf(_("  --preserve-subscription-state preserve the subscription state fully\n"));
 	printf(_("  -?, --help                    show this help, then exit\n"));
 	printf(_("\n"
 			 "Before running pg_upgrade you must:\n"
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 3eea0139c7..131fd9a56e 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -304,6 +304,7 @@ typedef struct
 	transferMode transfer_mode; /* copy files or link them? */
 	int			jobs;			/* number of processes/threads to use */
 	char	   *socketdir;		/* directory to use for Unix sockets */
+	bool		preserve_subscriptions; /* fully transfer subscription state */
 } UserOpts;
 
 typedef struct
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 028588fb33..6b47efb884 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3921,7 +3921,8 @@ typedef enum AlterSubscriptionType
 	ALTER_SUBSCRIPTION_DROP_PUBLICATION,
 	ALTER_SUBSCRIPTION_REFRESH,
 	ALTER_SUBSCRIPTION_ENABLED,
-	ALTER_SUBSCRIPTION_SKIP
+	ALTER_SUBSCRIPTION_SKIP,
+	ALTER_SUBSCRIPTION_ADD_TABLE
 } AlterSubscriptionType;
 
 typedef struct AlterSubscriptionStmt
-- 
2.37.0

#32Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Julien Rouhaud (#31)
RE: pg_upgrade and logical replication

Dear Julien,

I'm attaching a v3 to fix a recent conflict with pg_dump due to a563c24c9574b7
(Allow pg_dump to include/exclude child tables automatically).

Thank you for making the patch.
FYI - it could not be applied due to recent commits. SUBOPT_* and attributes
in SubscriptionInfo were added recently.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

#33Julien Rouhaud
rjuju123@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#32)
1 attachment(s)
Re: pg_upgrade and logical replication

Hi,

On Thu, Apr 06, 2023 at 04:49:59AM +0000, Hayato Kuroda (Fujitsu) wrote:

Dear Julien,

I'm attaching a v3 to fix a recent conflict with pg_dump due to a563c24c9574b7
(Allow pg_dump to include/exclude child tables automatically).

Thank you for making the patch.
FYI - it could not be applied due to recent commits. SUBOPT_* and attributes
in SubscriptionInfo was added these days.

Thanks a lot for warning me!

While rebasing and testing the patch, I realized that I forgot to git-add a
chunk, so I went ahead and added some minimal TAP tests to make sure that the
feature and various checks work as expected, also demonstrating that you can
safely resume after running pg_upgrade a logical replication setup where only
some of the tables are added to a publication, where new rows and new tables
are added to the publication while pg_upgrade is running (for the new table you
obviously need to make sure that the same relation exist on the subscriber side
but that's orthogonal to this patch).

While doing so, I also realized that the subscription's underlying replication
origin remote LSN is only set after some activity is seen *after* the initial
sync, so I also added a new check in pg_upgrade to make sure that all remote
origin tied to a subscription have a valid remote_lsn when the new option is
used. Documentation is updated to cover that, same for the TAP tests.

v4 attached.
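
For reference, with this patch the schema dump produced in binary-upgrade mode
would contain commands along these lines (the subscription name, relation oid,
connection string and LSN values below are invented for illustration):

```sql
-- hypothetical fragment of pg_dump --binary-upgrade
-- --preserve-subscription-state output
CREATE SUBSCRIPTION "sub" CONNECTION 'host=pub dbname=postgres'
    PUBLICATION "pub" WITH (connect = false, lsn = '0/16B6C50');
ALTER SUBSCRIPTION "sub" ADD TABLE (relid = 16384, state = 'r',
    lsn = '0/16B6C50');
```

The lsn option on CREATE SUBSCRIPTION restores the replication origin's
remote LSN, and each ADD TABLE restores one pg_subscription_rel row, so the
apply worker can resume from the recorded position instead of re-syncing.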

Attachments:

v4-0001-Optionally-preserve-the-full-subscription-s-state.patchtext/plain; charset=us-asciiDownload
From a5823a0ea289860367e0ebfb76c7dad7be5337e7 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Wed, 22 Feb 2023 09:19:32 +0800
Subject: [PATCH v4] Optionally preserve the full subscription's state during
 pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

Similarly, the subscriptions' replication origins are needed to ensure
that we don't replicate anything twice.

To fix this problem, this patch teaches pg_dump in binary upgrade mode to emit
additional commands to be able to restore the content of pg_subscription_rel,
and an additional LSN parameter in the subscription creation to restore the
underlying replication origin's remote LSN.  The LSN parameter is only
accepted in CREATE SUBSCRIPTION in binary upgrade mode.

The new ALTER SUBSCRIPTION subcommand, usable only during binary upgrade, has
the following syntax:

ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

The relation is identified by its oid, as it's preserved during pg_upgrade.
The lsn is optional, and defaults to NULL / InvalidXLogRecPtr if not provided.
Explicitly passing InvalidXLogRecPtr (0/0) is however not allowed.

This mode is optional and not enabled by default.  A new
--preserve-subscription-state option is added to pg_upgrade to use it.  For
now, pg_upgrade will check that all the subscriptions have a valid replication
origin remote_lsn, and that all underlying relations are in 'r' (ready) state,
and will error out if that's not the case, logging the reason for the failure.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml          |  19 +++
 src/backend/commands/subscriptioncmds.c  |  67 +++++++-
 src/backend/parser/gram.y                |  11 ++
 src/bin/pg_dump/pg_backup.h              |   2 +
 src/bin/pg_dump/pg_dump.c                | 114 ++++++++++++-
 src/bin/pg_dump/pg_dump.h                |  13 ++
 src/bin/pg_upgrade/check.c               |  82 +++++++++
 src/bin/pg_upgrade/dump.c                |   3 +-
 src/bin/pg_upgrade/meson.build           |   1 +
 src/bin/pg_upgrade/option.c              |   7 +
 src/bin/pg_upgrade/pg_upgrade.h          |   1 +
 src/bin/pg_upgrade/t/003_subscription.pl | 204 +++++++++++++++++++++++
 src/include/nodes/parsenodes.h           |   3 +-
 13 files changed, 522 insertions(+), 5 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/003_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 7816b4c685..b23c536954 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -240,6 +240,25 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--preserve-subscription-state</option></term>
+      <listitem>
+       <para>
+        Fully preserve the logical subscription state if any.  That includes
+        the underlying replication origins with their remote LSNs and the list
+        of relations in each subscription so that replication can simply be
+        resumed if the subscriptions are reactivated.
+        If that option isn't used, it is up to the user to reactivate the
+        subscriptions in a suitable way; see the subscription part in <xref
+        linkend="pg-dump-notes"/> for more information.
+        If this option is used and any of the subscriptions on the old cluster
+        has an unknown <varname>remote_lsn</varname> (0/0), or has any relation
+        in a state different from <literal>r</literal> (ready), the
+        <application>pg_upgrade</application> run will error.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-?</option></term>
       <term><option>--help</option></term>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3251d89ba8..4fa688d16f 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -71,6 +71,8 @@
 #define SUBOPT_RUN_AS_OWNER			0x00001000
 #define SUBOPT_LSN					0x00002000
 #define SUBOPT_ORIGIN				0x00004000
+#define SUBOPT_RELID				0x00008000
+#define SUBOPT_STATE				0x00010000
 
 /* check if the 'val' has 'bits' set */
 #define IsSet(val, bits)  (((val) & (bits)) == (bits))
@@ -97,6 +99,8 @@ typedef struct SubOpts
 	bool		runasowner;
 	char	   *origin;
 	XLogRecPtr	lsn;
+	Oid			relid;
+	char		state;
 } SubOpts;
 
 static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -353,6 +357,38 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
 			opts->specified_opts |= SUBOPT_LSN;
 			opts->lsn = lsn;
 		}
+		else if (IsSet(supported_opts, SUBOPT_RELID) &&
+				 strcmp(defel->defname, "relid") == 0)
+		{
+			Oid			relid = defGetObjectId(defel);
+
+			if (IsSet(opts->specified_opts, SUBOPT_RELID))
+				errorConflictingDefElem(defel, pstate);
+
+			if (!OidIsValid(relid))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("invalid relation identifier used")));
+
+			opts->specified_opts |= SUBOPT_RELID;
+			opts->relid = relid;
+		}
+		else if (IsSet(supported_opts, SUBOPT_STATE) &&
+				 strcmp(defel->defname, "state") == 0)
+		{
+			char	   *state_str = defGetString(defel);
+
+			if (IsSet(opts->specified_opts, SUBOPT_STATE))
+				errorConflictingDefElem(defel, pstate);
+
+			if (strlen(state_str) != 1)
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("invalid relation state used")));
+
+			opts->specified_opts |= SUBOPT_STATE;
+			opts->state = state_str[0];
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -580,6 +616,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	bits32		supported_opts;
 	SubOpts		opts = {0};
 	AclResult	aclresult;
+	RepOriginId	originid;
 
 	/*
 	 * Parse and check options.
@@ -592,6 +629,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 					  SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
 					  SUBOPT_DISABLE_ON_ERR | SUBOPT_PASSWORD_REQUIRED |
 					  SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN);
+	if (IsBinaryUpgrade)
+		supported_opts |= SUBOPT_LSN;
 	parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
 
 	/*
@@ -718,7 +757,12 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	recordDependencyOnOwner(SubscriptionRelationId, subid, owner);
 
 	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
-	replorigin_create(originname);
+	originid = replorigin_create(originname);
+
+	if (IsBinaryUpgrade && IsSet(opts.specified_opts, SUBOPT_LSN))
+		replorigin_advance(originid, opts.lsn, InvalidXLogRecPtr,
+							false /* backward */ ,
+							false /* WAL log */ );
 
 	/*
 	 * Connect to remote side to execute requested commands and fetch table
@@ -1428,6 +1472,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 				break;
 			}
 
+		case ALTER_SUBSCRIPTION_ADD_TABLE:
+			{
+				if (!IsBinaryUpgrade)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR)),
+							errmsg("ALTER SUBSCRIPTION ... ADD TABLE is not supported"));
+
+				supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
+				parse_subscription_options(pstate, stmt->options,
+										   supported_opts, &opts);
+
+				/* relid and state should always be provided. */
+				Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
+				Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
+
+				AddSubscriptionRelState(subid, opts.relid, opts.state,
+										opts.lsn);
+
+				break;
+			}
+
 		default:
 			elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
 				 stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index acf6cf4866..0432bf2cb4 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10695,6 +10695,17 @@ AlterSubscriptionStmt:
 					n->options = $5;
 					$$ = (Node *) n;
 				}
+			/* for binary upgrade only */
+			| ALTER SUBSCRIPTION name ADD_P TABLE definition
+				{
+					AlterSubscriptionStmt *n =
+						makeNode(AlterSubscriptionStmt);
+
+					n->kind = ALTER_SUBSCRIPTION_ADD_TABLE;
+					n->subname = $3;
+					n->options = $6;
+					$$ = (Node *) n;
+				}
 		;
 
 /*****************************************************************************
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index aba780ef4b..8a72a39d60 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -200,6 +200,8 @@ typedef struct _dumpOptions
 
 	int			sequence_data;	/* dump sequence data even in schema-only mode */
 	int			do_nothing;
+
+	bool		preserve_subscriptions;
 } DumpOptions;
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7a504dfe25..b0d18689a6 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -431,6 +431,7 @@ main(int argc, char **argv)
 		{"table-and-children", required_argument, NULL, 12},
 		{"exclude-table-and-children", required_argument, NULL, 13},
 		{"exclude-table-data-and-children", required_argument, NULL, 14},
+		{"preserve-subscription-state", no_argument, NULL, 15},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -657,6 +658,10 @@ main(int argc, char **argv)
 										  optarg);
 				break;
 
+			case 15:			/* include full subscription state */
+				dopt.preserve_subscriptions = true;
+				break;
+
 			default:
 				/* getopt_long already emitted a complaint */
 				pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -714,6 +719,10 @@ main(int argc, char **argv)
 	if (dopt.do_nothing && dopt.dump_inserts == 0)
 		pg_fatal("option --on-conflict-do-nothing requires option --inserts, --rows-per-insert, or --column-inserts");
 
+	/* --preserve-subscription-state requires --binary-upgrade */
+	if (dopt.preserve_subscriptions && !dopt.binary_upgrade)
+		pg_fatal("option --preserve-subscription-state requires option --binary-upgrade");
+
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
 
@@ -4585,6 +4594,69 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionRels
+ *	  get information about the given subscription's relations
+ */
+static SubRelInfo *
+getSubscriptionRels(Archive *fout, Oid subid, int *nrels)
+{
+	SubRelInfo *rels;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i,
+				ntups;
+
+	if (!fout->dopt->binary_upgrade || !fout->dopt->preserve_subscriptions)
+	{
+		*nrels = 0;
+
+		return NULL;
+	}
+
+	query = createPQExpBuffer();
+
+	appendPQExpBuffer(query, "SELECT srrelid, srsubstate, srsublsn "
+								" FROM pg_subscription_rel"
+								" WHERE srsubid = %u", subid);
+
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	*nrels = ntups;
+
+	if (ntups == 0)
+	{
+		rels = NULL;
+		goto cleanup;
+	}
+
+	/*
+	 * Get subscription relation fields.
+	 */
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	rels = pg_malloc(ntups * sizeof(SubRelInfo));
+
+	for (i = 0; i < ntups; i++)
+	{
+		rels[i].srrelid = atooid(PQgetvalue(res, i, i_srrelid));
+		rels[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		rels[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+
+	return rels;
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4610,6 +4682,7 @@ getSubscriptions(Archive *fout)
 	int			i_subpublications;
 	int			i_subbinary;
 	int			i_subpasswordrequired;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4664,15 +4737,19 @@ getSubscriptions(Archive *fout)
 	if (fout->remoteVersion >= 160000)
 		appendPQExpBufferStr(query,
 							 " s.suborigin,\n"
-							 " s.subpasswordrequired\n");
+							 " s.subpasswordrequired,\n");
 	else
 		appendPQExpBuffer(query,
 						  " '%s' AS suborigin,\n"
-						  " 't' AS subpasswordrequired\n",
+						  " 't' AS subpasswordrequired,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	appendPQExpBufferStr(query, " o.remote_lsn\n");
+
 	appendPQExpBufferStr(query,
 						 "FROM pg_subscription s\n"
+						 "LEFT JOIN pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4698,6 +4775,7 @@ getSubscriptions(Archive *fout)
 	i_subdisableonerr = PQfnumber(res, "subdisableonerr");
 	i_suborigin = PQfnumber(res, "suborigin");
 	i_subpasswordrequired = PQfnumber(res, "subpasswordrequired");
+	i_suboriginremotelsn = PQfnumber(res, "remote_lsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4730,6 +4808,15 @@ getSubscriptions(Archive *fout)
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
 		subinfo[i].subpasswordrequired =
 			pg_strdup(PQgetvalue(res, i, i_subpasswordrequired));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+
+		subinfo[i].subrels = getSubscriptionRels(fout,
+												 subinfo[i].dobj.catId.oid,
+												 &subinfo[i].nrels);
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4814,9 +4901,31 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 	if (strcmp(subinfo->subpasswordrequired, "t") != 0)
 		appendPQExpBuffer(query, ", password_required = false");
 
+	if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
+	}
+
 	appendPQExpBufferStr(query, ");\n");
 
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		for (i = 0; i < subinfo->nrels; i++)
+		{
+			appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
+									 "(relid = %u, state = '%c'",
+									 qsubname,
+									 subinfo->subrels[i].srrelid,
+									 subinfo->subrels[i].srsubstate);
+
+			if (subinfo->subrels[i].srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", lsn = '%s'",
+								  subinfo->subrels[i].srsublsn);
+
+			appendPQExpBufferStr(query, ");");
+		}
+
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
 								  .owner = subinfo->rolname,
@@ -4824,6 +4933,7 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 								  .section = SECTION_POST_DATA,
 								  .createStmt = query->data,
 								  .dropStmt = delq->data));
+	}
 
 	if (subinfo->dobj.dump & DUMP_COMPONENT_COMMENT)
 		dumpComment(fout, "SUBSCRIPTION", qsubname,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index ed6ce41ad7..2f7e805cfc 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -647,6 +647,16 @@ typedef struct _PublicationSchemaInfo
 	NamespaceInfo *pubschema;
 } PublicationSchemaInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	Oid		srrelid;
+	char	srsubstate;
+	char   *srsublsn;
+} SubRelInfo;
+
 /*
  * The SubscriptionInfo struct is used to represent subscription.
  */
@@ -664,6 +674,9 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *subpasswordrequired;
+	char	   *suboriginremotelsn;
+	int			nrels;
+	SubRelInfo *subrels;
 } SubscriptionInfo;
 
 /*
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fea159689e..1634b26175 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -20,6 +20,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_state(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -103,6 +104,8 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_composite_data_type_usage(&old_cluster);
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
+	if (user_opts.preserve_subscriptions)
+		check_for_subscription_state(&old_cluster);
 
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the on-disk
@@ -785,6 +788,85 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_state()
+ *
+ * Verify that all subscriptions have a valid remote_lsn and don't contain
+ * any table in a state other than ready.
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)
+{
+	int			dbnum;
+	bool		is_error = false;
+
+	Assert(user_opts.preserve_subscriptions);
+
+	/* No subscription before pg10. */
+	if (GET_MAJOR_VERSION(cluster->major_version) < 1000)
+		return;
+
+	prep_status("Checking for subscription state");
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		int			nb;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* pg_subscription and pg_replication_origin_status are shared, so check them only once. */
+		if (dbnum == 0)
+		{
+			res = executeQueryOrDie(conn,
+									"SELECT count(0) "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT JOIN pg_catalog.pg_replication_origin_status os"
+									"  ON os.external_id = 'pg_' || s.oid "
+									"WHERE coalesce(remote_lsn, '0/0') = '0/0'");
+
+			if (PQntuples(res) != 1)
+				pg_fatal("could not determine the number of remote origins with an invalid remote_lsn");
+
+			nb = atoi(PQgetvalue(res, 0, 0));
+			if (nb != 0)
+			{
+				is_error = true;
+				pg_log(PG_WARNING,
+					   "\nWARNING: %d subscription(s) have an invalid remote_lsn",
+					   nb);
+			}
+			PQclear(res);
+		}
+
+		res = executeQueryOrDie(conn,
+								"SELECT count(0) "
+								"FROM pg_catalog.pg_subscription_rel "
+								"WHERE srsubstate != 'r'");
+
+		if (PQntuples(res) != 1)
+			pg_fatal("could not determine the number of non-ready subscription relations");
+
+		nb = atoi(PQgetvalue(res, 0, 0));
+		if (nb != 0)
+		{
+			is_error = true;
+			pg_log(PG_WARNING,
+				   "\nWARNING: database \"%s\" has %d subscription "
+				   "relation(s) in a non-ready state", active_db->db_name, nb);
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (is_error)
+		pg_fatal("--preserve-subscription-state is incompatible with "
+				 "subscriptions in the state reported above");
+
+	check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
diff --git a/src/bin/pg_upgrade/dump.c b/src/bin/pg_upgrade/dump.c
index 6c8c82dca8..9284576af7 100644
--- a/src/bin/pg_upgrade/dump.c
+++ b/src/bin/pg_upgrade/dump.c
@@ -53,9 +53,10 @@ generate_old_dump(void)
 
 		parallel_exec_prog(log_file_name, NULL,
 						   "\"%s/pg_dump\" %s --schema-only --quote-all-identifiers "
-						   "--binary-upgrade --format=custom %s --file=\"%s/%s\" %s",
+						   "--binary-upgrade --format=custom %s %s --file=\"%s/%s\" %s",
 						   new_cluster.bindir, cluster_conn_opts(&old_cluster),
 						   log_opts.verbose ? "--verbose" : "",
+						   user_opts.preserve_subscriptions ? "--preserve-subscription-state" : "",
 						   log_opts.dumpdir,
 						   sql_file_name, escaped_connstr.data);
 
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 12a97f84e2..9ea25dec70 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -42,6 +42,7 @@ tests += {
     'tests': [
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
+      't/003_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index 8869b6b60d..b033aa26ba 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -57,6 +57,7 @@ parseCommandLine(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"clone", no_argument, NULL, 1},
 		{"copy", no_argument, NULL, 2},
+		{"preserve-subscription-state", no_argument, NULL, 3},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -66,6 +67,7 @@ parseCommandLine(int argc, char *argv[])
 
 	user_opts.do_sync = true;
 	user_opts.transfer_mode = TRANSFER_MODE_COPY;
+	user_opts.preserve_subscriptions = false;
 
 	os_info.progname = get_progname(argv[0]);
 
@@ -199,6 +201,10 @@ parseCommandLine(int argc, char *argv[])
 				user_opts.transfer_mode = TRANSFER_MODE_COPY;
 				break;
 
+			case 3:
+				user_opts.preserve_subscriptions = true;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
 						os_info.progname);
@@ -289,6 +295,7 @@ usage(void)
 	printf(_("  -V, --version                 display version information, then exit\n"));
 	printf(_("  --clone                       clone instead of copying files to new cluster\n"));
 	printf(_("  --copy                        copy files to new cluster (default)\n"));
+	printf(_("  --preserve-subscription-state preserve the subscription state fully\n"));
 	printf(_("  -?, --help                    show this help, then exit\n"));
 	printf(_("\n"
 			 "Before running pg_upgrade you must:\n"
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 3eea0139c7..131fd9a56e 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -304,6 +304,7 @@ typedef struct
 	transferMode transfer_mode; /* copy files or link them? */
 	int			jobs;			/* number of processes/threads to use */
 	char	   *socketdir;		/* directory to use for Unix sockets */
+	bool		preserve_subscriptions; /* fully transfer subscription state */
 } UserOpts;
 
 typedef struct
diff --git a/src/bin/pg_upgrade/t/003_subscription.pl b/src/bin/pg_upgrade/t/003_subscription.pl
new file mode 100644
index 0000000000..9328b3557b
--- /dev/null
+++ b/src/bin/pg_upgrade/t/003_subscription.pl
@@ -0,0 +1,204 @@
+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use Cwd qw(abs_path);
+use File::Basename qw(dirname);
+use File::Compare;
+use File::Find qw(find);
+use File::Path qw(rmtree);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::AdjustUpgrade;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $bindir = $new_sub->config_data('--bindir');
+
+sub insert_line
+{
+	my $payload = shift;
+
+	foreach("t1", "t2")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("t1", "t2")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line('before initial sync');
+
+# Setup logical replication, replicating only 1 table
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub FOR TABLE t1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub CONNECTION '$connstr' PUBLICATION pub");
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'sub');
+
+# replication origin's remote_lsn isn't set if not data is replicated after the
+# initial sync
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+		'--preserve-subscription-state',
+		'--check',
+	],
+	'run of pg_upgrade --check for old instance with invalid remote_lsn');
+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ not removed after pg_upgrade failure");
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Make sure the replication origin is set
+insert_line('after initial sync');
+$publisher->wait_for_catchup('sub');
+
+my $result = $old_sub->safe_psql('postgres',
+    "SELECT COUNT(*) FROM pg_subscription_rel WHERE srsubstate != 'r'");
+is ($result, qq(0), "All tables in pg_subscription_rel should be in ready state");
+
+# Check the number of rows for each table on each server
+$result = $publisher->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(2), "Table t1 should have 2 rows on the publisher");
+$result = $publisher->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(2), "Table t2 should have 2 rows on the publisher");
+$result = $old_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(2), "Table t1 should have 2 rows on the old subscriber");
+$result = $old_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(0), "Table t2 should have 0 rows on the old subscriber");
+
+# Check that pg_upgrade refuses to upgrade subscription with non ready tables
+$old_sub->safe_psql('postgres',
+    "ALTER SUBSCRIPTION sub DISABLE");
+$old_sub->safe_psql('postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'i' WHERE srsubstate = 'r'");
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+		'--preserve-subscription-state',
+		'--check',
+	],
+	'run of pg_upgrade --check for old instance with incorrect sub rel');
+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ not removed after pg_upgrade failure");
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# and otherwise works
+$old_sub->safe_psql('postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'r' WHERE srsubstate = 'i'");
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+		'--preserve-subscription-state',
+		'--check',
+	],
+	'run of pg_upgrade --check for old instance with correct sub rel');
+
+# Stop the old subscriber, insert a row in each table while it's down and add
+# t2 to the publication
+$old_sub->stop;
+
+insert_line('while old_sub is down');
+
+# Run pg_upgrade
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+		'--preserve-subscription-state',
+	],
+	'run of pg_upgrade for new sub');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after pg_upgrade success");
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub ADD TABLE t2");
+
+$new_sub->start;
+
+# There should be no new replicated rows before enabling the subscription
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(2), "Table t1 should still have 2 rows on the new subscriber");
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(0), "Table t2 should still have 0 rows on the new subscriber");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION sub ENABLE");
+
+$publisher->wait_for_catchup('sub');
+
+# Rows on t1 should have been replicated
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(3), "Table t1 should now have 3 rows on the new subscriber");
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(0), "Table t2 should still have 0 rows on the new subscriber");
+
+# Refresh the subscription, only the missing row on t2 show be replicated
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION sub REFRESH PUBLICATION");
+$publisher->wait_for_catchup('sub');
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(3), "Table t1 should still have 3 rows on the new subscriber");
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(3), "Table t2 should now have 3 rows on the new subscriber");
+
+done_testing();
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index cc7b32b279..0ec85ceda2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -4028,7 +4028,8 @@ typedef enum AlterSubscriptionType
 	ALTER_SUBSCRIPTION_DROP_PUBLICATION,
 	ALTER_SUBSCRIPTION_REFRESH,
 	ALTER_SUBSCRIPTION_ENABLED,
-	ALTER_SUBSCRIPTION_SKIP
+	ALTER_SUBSCRIPTION_SKIP,
+	ALTER_SUBSCRIPTION_ADD_TABLE
 } AlterSubscriptionType;
 
 typedef struct AlterSubscriptionStmt
-- 
2.37.0

#34Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Julien Rouhaud (#33)
RE: pg_upgrade and logical replication

Dear Julien,

Thank you for updating the patch. I have checked it.
The following are general or non-minor questions:

1.
Feature freeze for PG16 has already passed, so I think there is no reason to
rush this patch. Given that, could you allow upgrading while data is still
being synchronized? Personally, I think it could be added as a 0002 patch that
extends the feature. Or have you already found a problem with it?

2.
I have a question about the SQL interface:

ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

Here the oid of the table is specified directly, but is it really preserved
between the old and new nodes? The similar command ALTER PUBLICATION requires
the name of the table, not the oid.

3.
Currently getSubscriptionRels() is called from getSubscriptions(), but I could
not find the reason why it must be done that way. Other functions like
getPublicationTables() are called directly from getSchemaData(), so that
pattern should be followed. Additionally, I found a few problems.

* Only tables that are to be dumped should be included. See getPublicationTables().
* dropStmt for subscription relations seems not to be needed.
* Maybe security labels and comments should also be dumped.

The following are minor comments.

4. parse_subscription_options

```
+ opts->state = defGetString(defel)[0];
```

[0] is not needed.

5. AlterSubscription

```
+                               supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
+                               parse_subscription_options(pstate, stmt->options,
+                                                                                  supported_opts, &opts);
+
+                               /* relid and state should always be provided. */
+                               Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
+                               Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
+
```

SUBOPT_LSN accepts "none" string, which means InvalidLSN. Isn't it better to
reject it?

6. dumpSubscription()

```
+       if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
+               subinfo->suboriginremotelsn)
+       {
+               appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
+       }
```

{} is not needed.

7. pg_dump.h

```
+/*
+ * The SubRelInfo struct is used to represent subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+       Oid             srrelid;
+       char    srsubstate;
+       char   *srsublsn;
+} SubRelInfo;
```

This typedef must be added to typedefs.list.

8. check_for_subscription_state

```
nb = atooid(PQgetvalue(res, 0, 0));
if (nb != 0)
{
is_error = true;
pg_log(PG_WARNING,
"\nWARNING: %d subscription have invalid remote_lsn",
nb);
}
```

I think there is no need to use atooid. Additionally, isn't it better to show
the names of the subscriptions that have an invalid remote_lsn?

```
nb = atooid(PQgetvalue(res, 0, 0));
if (nb != 0)
{
is_error = true;
pg_log(PG_WARNING,
"\nWARNING: database \"%s\" has %d subscription "
"relations(s) in non-ready state", active_db->db_name, nb);
}
```

Same as above.
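To illustrate the intent (this is only a sketch of the reporting logic in
Python with hypothetical data, not patch code -- the real check would fetch
subscription names with an extra SQL query and report them through pg_log):

```python
# Sketch: report each subscription with an invalid (0/0) remote_lsn by name,
# instead of printing only a count. Input rows are hypothetical examples of
# (subscription name, remote_lsn); None stands in for InvalidXLogRecPtr.
def report_invalid_remote_lsn(subs):
    return ['WARNING: subscription "%s" has an invalid remote_lsn' % name
            for name, lsn in subs if lsn is None]

warnings = report_invalid_remote_lsn([("sub1", "0/16B3748"), ("sub2", None)])
for w in warnings:
    print(w)
```

The same shape would apply to the non-ready-relations warning, listing the
offending relations per database.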

9. parseCommandLine

```
+ user_opts.preserve_subscriptions = false;
```

I think this initialization is not needed because it is the default.

And maybe you missed running pgindent.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

#35Peter Smith
smithpb2250@gmail.com
In reply to: Julien Rouhaud (#33)
Re: pg_upgrade and logical replication

Here are some review comments for patch v4-0001 (not the test code)

(There are some overlaps here with what Kuroda-san already posted
yesterday because we were looking at the same patch code. Also, a few
of my comments might become moot points if refactoring will be done
according to Kuroda-san's "general" questions).

======
Commit message

1.
To fix this problem, this patch teaches pg_dump in binary upgrade mode to emit
additional commands to be able to restore the content of pg_subscription_rel,
and addition LSN parameter in the subscription creation to restore the
underlying replication origin remote LSN. The LSN parameter is only accepted
in CREATE SUBSCRIPTION in binary upgrade mode.

~

SUGGESTION
To fix this problem, this patch teaches pg_dump in binary upgrade mode
to emit additional ALTER SUBSCRIPTION commands to facilitate restoring
the content of pg_subscription_rel, and provides an additional LSN
parameter for CREATE SUBSCRIPTION to restore the underlying
replication origin remote LSN. The new ALTER SUBSCRIPTION syntax and
new LSN parameter are not exposed to the user -- they are only
accepted in binary upgrade mode.

======
src/sgml/ref/pgupgrade.sgml

2.
+     <varlistentry>
+      <term><option>--preserve-subscription-state</option></term>
+      <listitem>
+       <para>
+        Fully preserve the logical subscription state if any.  That includes
+        the underlying replication origin with their remote LSN and the list of
+        relations in each subscription so that replication can be simply
+        resumed if the subscriptions are reactived.
+        If that option isn't used, it is up to the user to reactivate the
+        subscriptions in a suitable way; see the subscription part in <xref
+        linkend="pg-dump-notes"/> for more information.
+        If this option is used and any of the subscription on the old cluster
+        has an unknown <varname>remote_lsn</varname> (0/0), or has any relation
+        in a state different from <literal>r</literal> (ready), the
+        <application>pg_upgrade</application> run will error.
+       </para>
+      </listitem>
+     </varlistentry>

~

2a.
"If that option isn't used" --> "If this option isn't used"

~

2b.
The link renders strangely. It just says:

See the subscription part in the [section called "Notes"] for more information.

Maybe the link part can be rewritten so that it renders more nicely,
and also makes mention of pg_dump.

~

2c.
Maybe it is more readable to have the "isn't used" and "is used" parts
as separate paragraphs?

~

2d.
Typo /reactived/reactivated/ ??

======
src/backend/commands/subscriptioncmds.c

3.
+#define SUBOPT_RELID 0x00008000
+#define SUBOPT_STATE 0x00010000

Maybe 'SUBOPT_RELSTATE' is a better name for this per-relation state option?
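
For context, the SUBOPT_* constants are bit flags OR'ed into supported_opts
and tested with the IsSet() macro; the mechanism, transliterated into Python
(SUBOPT_RELID/SUBOPT_STATE mirror the values quoted above, while the
SUBOPT_LSN value here is made up for illustration):

```python
# Bit-flag handling analogous to subscriptioncmds.c's SUBOPT_* masks and the
# IsSet() macro. SUBOPT_LSN's value is hypothetical.
SUBOPT_RELID = 0x00008000
SUBOPT_STATE = 0x00010000
SUBOPT_LSN = 0x00020000

def is_set(mask, flag):
    # Equivalent to C's IsSet(val, bits): ((val & (bits)) == (bits))
    return (mask & flag) == flag

supported_opts = SUBOPT_RELID | SUBOPT_STATE
```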

~~~

4. SubOpts

+ Oid relid;
+ char state;
 } SubOpts;

(similar to #3)

Maybe 'relstate' is a better name for this per-relation state?

~~~

5. parse_subscription_options

+ else if (IsSet(supported_opts, SUBOPT_STATE) &&
+ strcmp(defel->defname, "state") == 0)
+ {

(similar to #3)

Maybe called this option "relstate".

~

6.
+ if (strlen(state_str) != 1)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid relation state used")));

IIUC this syntax is not supposed to be reachable by user input. Maybe
there is some merit in making the errors look similar to those for the
normal options, but OTOH it could also be misleading.

This might as well just be: Assert(strlen(state_str) == 1 &&
*state_str == SUBREL_STATE_READY);
or even simply: Assert(IsBinaryUpgrade);

~~~

7. CreateSubscription

+ if(IsBinaryUpgrade)
+ supported_opts |= SUBOPT_LSN;
  parse_subscription_options(pstate, stmt->options, supported_opts, &opts);

7a.
Missing whitespace after the "if".

~

7b.
I wonder if this was deserving of a comment something like "The LSN
option is for internal use only"...

~~~

8. CreateSubscription

+ originid = replorigin_create(originname);
+
+ if (IsBinaryUpgrade && IsSet(opts.lsn, SUBOPT_LSN))
+ replorigin_advance(originid, opts.lsn, InvalidXLogRecPtr,
+ false /* backward */ ,
+ false /* WAL log */ );

I think the 'IsBinaryUpgrade' check is redundant here because
SUBOPT_LSN is not possible to be set unless that is true anyhow.

~~~

9. AlterSubscription

+ AddSubscriptionRelState(subid, opts.relid, opts.state,
+ opts.lsn);

This line wrapping of AddSubscriptionRelState seems unnecessary.

======
src/bin/pg_dump/pg_backup.h

10.
+
+ bool preserve_subscriptions;
 } DumpOptions;

Maybe name this field "preserve_subscription_state" for consistency
with the option name.

======
src/bin/pg_dump/pg_dump.c

11. dumpSubscription

  if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+ {
+ for (i = 0; i < subinfo->nrels; i++)
+ {
+ appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
+ "(relid = %u, state = '%c'",
+ qsubname,
+ subinfo->subrels[i].srrelid,
+ subinfo->subrels[i].srsubstate);
+
+ if (subinfo->subrels[i].srsublsn[0] != '\0')
+ appendPQExpBuffer(query, ", LSN = '%s'",
+   subinfo->subrels[i].srsublsn);
+
+ appendPQExpBufferStr(query, ");");
+ }
+

Maybe I misunderstood something -- Shouldn't this new ALTER
SUBSCRIPTION TABLE cmd only be happening when the option
dopt->preserve_subscriptions is true?

======
src/bin/pg_dump/pg_dump.h

12. SubRelInfo

+/*
+ * The SubRelInfo struct is used to represent subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+ Oid srrelid;
+ char srsubstate;
+ char   *srsublsn;
+} SubRelInfo;
+

12a.
"represent subscription relation" --> "represent a subscription relation"

~

12b.
Should include the indent file typdefs.list in the patch, and add this
new typedef to it.

======
src/bin/pg_upgrade/check.c

13. check_for_subscription_state

+/*
+ * check_for_subscription_state()
+ *
+ * Verify that all subscriptions have a valid remote_lsn and doesn't contain
+ * any table in a state different than ready.
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)

SUGGESTION
Verify that all subscriptions have a valid remote_lsn and do not
contain any tables with srsubstate other than READY ('r').

~~~

14.
+ /* No subscription before pg10. */
+ if (GET_MAJOR_VERSION(cluster->major_version < 1000))
+ return;

14a.
The existing checking code seems slightly different to this because
the other check_XXX calls are guarded by the GET_MAJOR_VERSION before
being called.

~

14b.
Furthermore, I was confused about the combination where the old cluster is <
PG10 and user_opts.preserve_subscriptions is true. Since this is just a return
(not an error) won't the subsequent pg_dump still attempt to use that
option (--preserve-subscriptions) even though we already know it
cannot work?

Would it be better to give an ERROR saying --preserve-subscriptions is
incompatible with the old PG version?

~~~

15.

+ pg_log(PG_WARNING,
+    "\nWARNING:  %d subscription have invalid remote_lsn",
+    nb);

15a.
"have invalid" --> "has invalid"

~

15b.
I guess it would be more useful if the message could include the names
of the failing subscriptions and/or the relations that were in the wrong
state. Maybe that means moving all this checking logic into the
pg_dump code?

======
src/bin/pg_upgrade/option.c

16. parseCommandLine

user_opts.transfer_mode = TRANSFER_MODE_COPY;
+ user_opts.preserve_subscriptions = false;

This initial assignment is not needed because user_opts is static.

======
src/bin/pg_upgrade/pg_upgrade.h

17.
char *socketdir; /* directory to use for Unix sockets */
+ bool preserve_subscriptions; /* fully transfer subscription state */
} UserOpts;

Maybe name this field 'preserve_subscription_state' to match the option.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#36Julien Rouhaud
rjuju123@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#34)
Re: pg_upgrade and logical replication

Hi,

On Wed, Apr 12, 2023 at 09:48:15AM +0000, Hayato Kuroda (Fujitsu) wrote:

Thank you for updating the patch. I have checked it.
The following are general or non-minor questions:

Thanks!

1.
Feature freeze for PG16 has already passed, so I think there is no reason to
rush this patch. Given that, could you allow upgrading while data is still
being synchronized? Personally, I think it could be added as a 0002 patch that
extends the feature. Or have you already found a problem with it?

I didn't really look into it, mostly because I don't think it's a sensible use
case. Logical sync of a relation is a heavy and time-consuming operation that
requires retaining the xmin for quite some time. This can already have some bad
effects on the publisher, so adding a pg_upgrade in the middle of that would
just make things worse. Upgrading a subscriber is a rare event that has to be
well planned (you need to test your application with the new version and so
on), and initial sync of relations shouldn't happen continually, so having to
wait for the sync to finish doesn't seem like a source of problems but might
instead avoid some for users who may not fully realize the implications.

If someone has a scenario where running pg_upgrade in the middle of a logical
sync is mandatory I can try to look at it, but for now I just don't see a good
reason to add even more complexity to this part of the code, especially since
adding regression tests seems a bit troublesome.

2.
I have a question about the SQL interface:

ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

Here the oid of the table is specified directly, but is it really preserved
between the old and new nodes?

Yes, pg_upgrade does need to preserve relations' oids (pg_dump in binary
upgrade mode emits binary_upgrade_set_next_heap_pg_class_oid() calls for
exactly that purpose).

The similar command ALTER PUBLICATION requires the name of the table,
not the oid.

Yes, but those are user-facing commands, while ALTER SUBSCRIPTION name ADD
TABLE is only used internally by pg_upgrade. My goal is to make this command
a bit faster by avoiding an extra cache lookup each time, relying on
pg_upgrade's existing requirements. If that's really a problem I can use the
name instead, but I haven't heard any argument against the oid so far.

3.
Currently getSubscriptionRels() is called from getSubscriptions(), but I could
not find the reason why it must be done that way. Other functions like
getPublicationTables() are called directly from getSchemaData(), so that
pattern should be followed.

I think you're right; doing a single getSubscriptionRels() call rather than
one per subscription should be more efficient.
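
The shape of that refactoring is a classic fetch-once-then-group: run one
query over pg_subscription_rel and attach each row to its subscription.
Sketched in Python with hypothetical oids (the real code would build the SQL
in pg_dump and store the rows into each SubscriptionInfo):

```python
from collections import defaultdict

# One pass over rows shaped like (srsubid, srrelid, srsubstate, srsublsn),
# i.e. the result of a single query instead of one query per subscription.
def group_subscription_rels(rows):
    rels_by_sub = defaultdict(list)
    for subid, relid, state, lsn in rows:
        rels_by_sub[subid].append((relid, state, lsn))
    return rels_by_sub

rels = group_subscription_rels([
    (16394, 16385, 'r', '0/16B3748'),  # hypothetical oids
    (16394, 16386, 'r', '0/16B3748'),
    (16400, 16385, 'i', None),
])
```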

Additionally, I found a few problems.

* Only tables that are to be dumped should be included. See getPublicationTables().

This is only done during pg_upgrade where all tables are dumped, so there
shouldn't be any need to filter the list.

* dropStmt for subscription relations seems not to be needed.

I'm not sure I understand this one. I agree that a dropStmt isn't needed, and
there's no such thing in the patch. Are you saying that you agree with it?

* Maybe security labels and comments should also be dumped.

Subscriptions' security labels and comments are already dumped (well, should
be dumped; AFAICS pg_dump was never taught to look at shared security labels
on objects other than databases but still tries to emit them, while pg_dumpall
handles pg_authid and pg_tablespace), and we can't add a security label or
comment on a subscription's relations, so I don't think this patch is missing
anything?

So unless I'm missing something it looks like shared security label handling is
partly broken, but that's orthogonal to this patch.

Followings are minor comments.

4. parse_subscription_options

```
+ opts->state = defGetString(defel)[0];
```

[0] is not needed.

It still needs to be dereferenced. I personally find [0] a bit clearer in that
situation, but I'm not opposed to a plain *.

5. AlterSubscription

```
+                               supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
+                               parse_subscription_options(pstate, stmt->options,
+                                                                                  supported_opts, &opts);
+
+                               /* relid and state should always be provided. */
+                               Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
+                               Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
+
```

SUBOPT_LSN accepts "none" string, which means InvalidLSN. Isn't it better to
reject it?

If you mean having an Assert for that, I agree. It's not supposed to be used
by users, so I don't think a non-debug check is sensible, as any user-provided
value has no reason to be correct anyway.

6. dumpSubscription()

```
+       if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
+               subinfo->suboriginremotelsn)
+       {
+               appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
+       }
```

{} is not needed.

Yes, but with the condition spanning two lines, the braces make it more
readable. I think a lot of code already uses curly braces in similar cases.

7. pg_dump.h

```
+/*
+ * The SubRelInfo struct is used to represent subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+       Oid             srrelid;
+       char    srsubstate;
+       char   *srsublsn;
+} SubRelInfo;
```

This typedef must be added to typedefs.list.

Right!

8. check_for_subscription_state

```
nb = atooid(PQgetvalue(res, 0, 0));
if (nb != 0)
{
is_error = true;
pg_log(PG_WARNING,
"\nWARNING: %d subscription have invalid remote_lsn",
nb);
}
```

I think there is no need to use atooid. Additionally, isn't it better to show
the names of the subscriptions that have an invalid remote_lsn?

Agreed.

```
nb = atooid(PQgetvalue(res, 0, 0));
if (nb != 0)
{
is_error = true;
pg_log(PG_WARNING,
"\nWARNING: database \"%s\" has %d subscription "
"relations(s) in non-ready state", active_db->db_name, nb);
}
```

Same as above.

Agreed.

9. parseCommandLine

```
+ user_opts.preserve_subscriptions = false;
```

I think this initialization is not needed because it is the default.

It's not strictly needed because of C rules, but I think it doesn't really hurt
to make it explicit rather than having to remember what the standard says.

And maybe you missed running pgindent.

I indeed haven't. There will probably be a global pgindent done soon so I will
do one for this patch afterwards.

#37Peter Smith
smithpb2250@gmail.com
In reply to: Julien Rouhaud (#33)
Re: pg_upgrade and logical replication

Here are some review comments for the v4-0001 test code only.

======

1.
All the comments look alike, so it is hard to know what is going on.
If each of the main test parts could be highlighted then the test code
would be easier to read IMO.

Something like below:

# ==========
# TEST CASE: Check that pg_upgrade refuses to upgrade a subscription
when the replication origin is not set.
#
# replication origin's remote_lsn isn't set if data was not replicated after the
# initial sync.

...

# ==========
# TEST CASE: Check that pg_upgrade refuses to upgrade a subscription
with non-ready tables.

...

# ==========
# TEST CASE: Check that pg_upgrade works when all subscription tables are ready.

...

# ==========
# TEST CASE: Change the publication while the old subscriber is offline.
#
# Stop the old subscriber, insert a row in each table while it's down, and add
# t2 to the publication.

...

# ==========
# TEST CASE: Enable the subscription.

...

# ==========
# TEST CASE: Refresh the subscription to get the newly published table t2.
#
# Only the missing row on t2 show be replicated.

~~~

2.
+# replication origin's remote_lsn isn't set if not data is replicated after the
+# initial sync

wording:
/if not data is replicated/if data is not replicated/

~~~

3.
# Make sure the replication origin is set

I was not sure if all of the SELECT COUNT(*) checking is needed
because it just seems normal pub/sub functionality. There is no
pg_upgrade happening, so really it seemed the purpose of this part was
mainly to set the origin so that it will not be a blocker for
ready-state tests that follow this code. Maybe this can just be
incorporated into the following test part.

~~~

4.
# There should be no new replicated rows before enabling the subscription
$result = $new_sub->safe_psql('postgres',
"SELECT count(*) FROM t1");
is ($result, qq(2), "Table t1 should still have 2 rows on the new subscriber");

4a.
TBH, I felt it might be easier to follow if the SQL was checking for
WHERE val = 'while old_sub is down' etc., rather than just using
SELECT COUNT(*) and then trusting the comments to describe what the
different counts mean.

~

4b.
All these messages like "Table t1 should still have 2 rows on the new
subscriber" don't seem very helpful. e.g. They are not saying anything
about WHAT this is testing or WHY it should still have 2 rows.

~~~

5.
# Refresh the subscription, only the missing row on t2 show be replicated

/show/should/

------
Kind Regards,
Peter Smith.
Fujitsu Australia.

#38Julien Rouhaud
rjuju123@gmail.com
In reply to: Julien Rouhaud (#36)
Re: pg_upgrade and logical replication

On Thu, Apr 13, 2023 at 10:51:10AM +0800, Julien Rouhaud wrote:

On Wed, Apr 12, 2023 at 09:48:15AM +0000, Hayato Kuroda (Fujitsu) wrote:

5. AlterSubscription

```
+                               supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
+                               parse_subscription_options(pstate, stmt->options,
+                                                                                  supported_opts, &opts);
+
+                               /* relid and state should always be provided. */
+                               Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
+                               Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
+
```

SUBOPT_LSN accepts "none" string, which means InvalidLSN. Isn't it better to
reject it?

If you mean have an Assert for that I agree. It's not supposed to be used by
users so I don't think having non debug check is sensible, as any user provided
value has no reason to be correct anyway.

After looking at the code I remember that I kept the lsn optional in ALTER
SUBSCRIPTION name ADD TABLE command processing. For now pg_upgrade checks that
all subscriptions have a valid remote_lsn so there should indeed always be a
value different from InvalidLSN/none specified, but it's still unclear to me
whether this check will eventually be weakened or not, so for now I think it's
better to keep AlterSubscription accepting this case, here and in all other
code paths.

If there's a hard objection I will just make the lsn mandatory.

9. parseCommandLine

```
+ user_opts.preserve_subscriptions = false;
```

I think this initialization is not needed because it is default.

It's not strictly needed because of C rules but I think it doesn't really hurt
to make it explicit and not have to remember what the standard says.

So I looked at nearby code and other options do rely on zero-initialized global
variables, so I agree that this initialization should be removed.

#39Julien Rouhaud
rjuju123@gmail.com
In reply to: Peter Smith (#35)
Re: pg_upgrade and logical replication

Hi,

On Thu, Apr 13, 2023 at 12:42:05PM +1000, Peter Smith wrote:

Here are some review comments for patch v4-0001 (not the test code)

Thanks!

(There are some overlaps here with what Kuroda-san already posted
yesterday because we were looking at the same patch code. Also, a few
of my comments might become moot points if refactoring will be done
according to Kuroda-san's "general" questions).

OK, for the record, the parts I don't reply to are things I fully agree with
and have already changed locally.

======
Commit message

1.
To fix this problem, this patch teaches pg_dump in binary upgrade mode to emit
additional commands to be able to restore the content of pg_subscription_rel,
and addition LSN parameter in the subscription creation to restore the
underlying replication origin remote LSN. The LSN parameter is only accepted
in CREATE SUBSCRIPTION in binary upgrade mode.

~

SUGGESTION
To fix this problem, this patch teaches pg_dump in binary upgrade mode
to emit additional ALTER SUBSCRIPTION commands to facilitate restoring
the content of pg_subscription_rel, and provides an additional LSN
parameter for CREATE SUBSCRIPTION to restore the underlying
replication origin remote LSN. The new ALTER SUBSCRIPTION syntax and
new LSN parameter are not exposed to the user -- they are only
accepted in binary upgrade mode.

Thanks, I eventually adapted the suggested wording a bit more:

To fix this problem, this patch teaches pg_dump in binary upgrade mode to emit
additional ALTER SUBSCRIPTION subcommands that will restore the content of
pg_subscription_rel, and also provides an additional LSN parameter for CREATE
SUBSCRIPTION to restore the underlying replication origin remote LSN. The new
ALTER SUBSCRIPTION subcommand and the new LSN parameter are not exposed to
users and only accepted in binary upgrade mode.

The new ALTER SUBSCRIPTION subcommand has the following syntax:

ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

2b.
The link renders strangely. It just says:

See the subscription part in the [section called "Notes"] for more information.

Maybe the link part can be rewritten so that it renders more nicely,
and also makes mention of pg_dump.

Yes, I saw that. I haven't tried to look at it yet, but that's indeed what I
wanted to do eventually.

======
src/backend/commands/subscriptioncmds.c

3.
+#define SUBOPT_RELID 0x00008000
+#define SUBOPT_STATE 0x00010000

Maybe 'SUBOPT_RELSTATE' is a better name for this per-relation state option?

I looked at it but part of the existing code is already using state as a
variable name, to be consistent with pg_subscription_rel.srsubstate. I think
it's better to use the same pattern in this patch.

6.
+ if (strlen(state_str) != 1)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid relation state used")));

IIUC this is syntax not supposed to be reachable by user input. Maybe
there is some merit in making the errors similar looking to the normal
options, but OTOH it could also be misleading.

It doesn't cost much and may be helpful for debugging, so I will use error
messages similar to the user-facing ones.

This might as well just be: Assert(strlen(state_str) == 1 &&
*state_str == SUBREL_STATE_READY);
or even simply: Assert(IsBinaryUpgrade);

As I mentioned in a previous email, it's still unclear to me whether the
restriction on the srsubstate will be weakened or not, so I prefer to keep such
part of the code generic and have the restriction centralized in the pg_upgrade
check.

I added some Assert(IsBinaryUpgrade) in those code path as it may not be
evident in this place that it's a requirement.

7. CreateSubscription

+ if(IsBinaryUpgrade)
+ supported_opts |= SUBOPT_LSN;
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
7b.
I wonder if this was deserving of a comment something like "The LSN
option is for internal use only"...

I was thinking that being valid only for IsBinaryUpgrade would be enough?

8. CreateSubscription

+ originid = replorigin_create(originname);
+
+ if (IsBinaryUpgrade && IsSet(opts.lsn, SUBOPT_LSN))
+ replorigin_advance(originid, opts.lsn, InvalidXLogRecPtr,
+ false /* backward */ ,
+ false /* WAL log */ );

I think the 'IsBinaryUpgrade' check is redundant here because
SUBOPT_LSN is not possible to be set unless that is true anyhow.

It's indeed redundant for now, but it's also used as a safeguard if some code
is changed. Maybe just having an assert(IsBinaryUpgrade) would be better
though.

While looking at it I noticed that this code was never reached, as I should
have checked IsSet(opts.specified_opts, ...). I fixed that and added a TAP
test to make sure that the restored remote_lsn is the same as on the old
subscription node.
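For reference, the value that TAP test compares can be inspected directly on each node with a query along these lines; a subscription's replication origin uses the external name 'pg_<subscription oid>', which is also the join condition the patch's pg_dump query relies on:

```sql
-- Show each subscription's replication origin remote LSN (run on the
-- subscriber); the origin's external_id for a subscription is 'pg_<oid>'.
SELECT s.subname, o.remote_lsn
FROM pg_subscription s
LEFT JOIN pg_replication_origin_status o
       ON o.external_id = 'pg_' || s.oid::text;
```

Running this before and after the upgrade should show identical remote_lsn values if the restore worked.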

9. AlterSubscription

+ AddSubscriptionRelState(subid, opts.relid, opts.state,
+ opts.lsn);

This line wrapping of AddSubscriptionRelState seems unnecessary.

Without it the line reaches 81 characters :(

======
src/bin/pg_dump/pg_backup.h

10.
+
+ bool preserve_subscriptions;
} DumpOptions;

Maybe name this field "preserve_subscription_state" for consistency
with the option name.

That's what I thought when I first wrote that code but I quickly had to use a
shorter name to avoid bloating the line length everywhere.

======
src/bin/pg_dump/pg_dump.c

11. dumpSubscription

if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+ {
+ for (i = 0; i < subinfo->nrels; i++)
+ {
+ appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
+ "(relid = %u, state = '%c'",
+ qsubname,
+ subinfo->subrels[i].srrelid,
+ subinfo->subrels[i].srsubstate);
+
+ if (subinfo->subrels[i].srsublsn[0] != '\0')
+ appendPQExpBuffer(query, ", LSN = '%s'",
+   subinfo->subrels[i].srsublsn);
+
+ appendPQExpBufferStr(query, ");");
+ }
+

Maybe I misunderstood something -- Shouldn't this new ALTER
SUBSCRIPTION TABLE cmd only be happening when the option
dopt->preserve_subscriptions is true?

It indirectly is, as in that case subinfo->nrels is guaranteed to be 0. I just
tried to keep the code simpler and avoid too many nested conditions.

12b.
Should include the indent file typdefs.list in the patch, and add this
new typedef to it.

FTR I checked and there wasn't too much noise when running pgindent on the
touched files, so I already locally added the new typedef and ran pgindent.

14.
+ /* No subscription before pg10. */
+ if (GET_MAJOR_VERSION(cluster->major_version) < 1000)
+ return;

14a.
The existing checking code seems slightly different to this because
the other check_XXX calls are guarded by the GET_MAJOR_VERSION before
being called.

No opinion on that, so I moved all the checks on the caller side.

14b.
Furthermore, I was confused about the combination when the < PG10 and
user_opts.preserve_subscriptions is true. Since this is just a return
(not an error) won't the subsequent pg_dump still attempt to use that
option (--preserve-subscriptions) even though we already know it
cannot work?

Will it error out though? I haven't tried but I think it will just silently do
nothing, which maybe isn't ideal, but may be somewhat expected if you try to
preserve something that doesn't exist.

Would it be better to give an ERROR saying -preserve-subscriptions is
incompatible with the old PG version?

I'm not opposed to adding some error, but I don't really know where it would
really be suitable. Maybe in the same code path explicitly error out if the
preserve subscription option is used with a pg10- source server?

15b.
I guess it would be more useful if the message can include the names
of the failing subscription and/or the relation that was in the wrong
state. Maybe that means moving all this checking logic into the
pg_dump code?

I think it's better to have the checks only once, so in pg_upgrade, but I'm not
strongly opposed to duplicating those tests if there's any complaint. In the
meantime I rephrased the warning to give the name of the problematic
subscription (but not the list of relations, as it's more likely to be a long
list, and it's easy to check manually afterwards and/or wait for all syncs to
finish).

#40Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Julien Rouhaud (#36)
RE: pg_upgrade and logical replication

Dear Julien,

I didn't really look into it, mostly because I don't think it's a sensible
use case. Logical sync of a relation is a heavy and time-consuming operation
that requires retaining the xmin for quite some time. This can already have
some bad effects on the publisher, so adding a pg_upgrade in the middle of that
would just make things worse. Upgrading a subscriber is a rare event that has
to be well planned (you need to test your application with the new version and
so on), and initial sync of relations shouldn't happen continually, so having
to wait for the sync to finish doesn't seem like a source of problems but
might instead avoid some for users who may not fully realize the implications.

If someone has a scenario where running pg_upgrade in the middle of a logical
sync is mandatory I can try to look at it, but for now I just don't see a good
reason to add even more complexity to this part of the code, especially since
adding regression tests seems a bit troublesome.

I do not have any scenario which runs pg_upgrade while synchronization is in
progress, because I agree that upgrading can be well planned. So it may be OK
not to add it in order to keep the patch simpler.

Here the oid of the table is directly specified, but is it really kept between
the old and new nodes?

Yes, pg_upgrade does need to preserve relations' oids.

I confirmed and agreed. dumpTableSchema() dumps an additional function
pg_catalog.binary_upgrade_set_next_heap_pg_class_oid() before each CREATE TABLE
statement. The function forces the table to have the specified OID.

The similar command ALTER PUBLICATION requires the name of the table,
not the oid.

Yes, but those are user-facing commands, while ALTER SUBSCRIPTION name ADD
TABLE is only used internally for pg_upgrade. My goal is to make this command
a bit faster by avoiding an extra cache lookup each time, relying on
pg_upgrade's existing requirements. If that's really a problem I can use the
name instead, but I didn't hear any argument against it for now.

OK, makes sense.
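For context, the internal-only subcommand being discussed identifies the table by its preserved oid, so the command pg_dump emits looks like the following (the subscription name, relid, and lsn values here are illustrative):

```sql
-- Emitted by pg_dump in binary upgrade mode only; rejected otherwise.
-- relid 16385 stands for a table oid preserved by pg_upgrade.
ALTER SUBSCRIPTION sub1 ADD TABLE (relid = 16385, state = 'r', lsn = '0/12345678');
```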

3.
Currently getSubscriptionRels() is called from getSubscriptions(), but I could
not find the reason why we must do it like that. Other functions like
getPublicationTables() are called directly from getSchemaData(), so that
pattern should be followed.

I think you're right, doing a single getSubscriptionRels() rather than once
per subscription should be more efficient.

Yes, we do not have to divide reading pg_subscription_rel per subscription.
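The consolidated approach boils down to a single ordered scan of pg_subscription_rel, using a window function to get each subscription's relation count up front for allocation (this is the query the v5 patch below ends up using):

```sql
-- One pass over pg_subscription_rel for all subscriptions; nrels lets the
-- caller allocate the per-subscription array before filling it.
SELECT srsubid, srrelid, srsubstate, srsublsn,
       count(*) OVER (PARTITION BY srsubid) AS nrels
FROM pg_subscription_rel
ORDER BY srsubid;
```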

Additionally, I found two problems.

* Only tables that are to be dumped should be included. See getPublicationTables().

This is only done during pg_upgrade where all tables are dumped, so there
shouldn't be any need to filter the list.

* dropStmt for subscription relations seems not to be needed.

I'm not sure I understand this one. I agree that a dropStmt isn't needed, and
there's no such thing in the patch. Are you saying that you agree with it?

Sorry for the unclear suggestion. I meant to say that we could keep the current
style even if getSubscriptionRels() is called separately. Your understanding
that it is not needed is right.

* Maybe security label and comments should be also dumped.

Subscription security labels and comments are already dumped (well, should be
dumped; AFAICS pg_dump was never taught to look at shared security labels on
objects other than databases but still tries to emit them, while pg_dumpall
instead handles pg_authid and pg_tablespace), and we can't add a security label
or comment on a subscription's relations, so I don't think this patch is
missing something?

So unless I'm missing something it looks like shared security label handling is
partly broken, but that's orthogonal to this patch.

Followings are minor comments.

4. parse_subscription_options

```
+ opts->state = defGetString(defel)[0];
```

[0] is not needed.

It still needs to be dereferenced. I personally find [0] a bit clearer in that
situation, but I'm not opposed to a plain *.

Sorry, I was confused. You are right.

5. AlterSubscription

```
+				supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
+				parse_subscription_options(pstate, stmt->options,
+										   supported_opts, &opts);
+
+				/* relid and state should always be provided. */
+				Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
+				Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
+
```

SUBOPT_LSN accepts "none" string, which means InvalidLSN. Isn't it better to
reject it?

If you mean having an Assert for that, I agree. It's not supposed to be used by
users, so I don't think a non-debug check is sensible, as any user-provided
value has no reason to be correct anyway.

Yes, I meant to request to add an Assert. Maybe you can add:
Assert(IsSet(opts.specified_opts, SUBOPT_LSN) && !XLogRecPtrIsInvalid(opts.lsn));

After looking at the code I remembered that I kept the lsn optional in the
ALTER SUBSCRIPTION name ADD TABLE command processing. For now pg_upgrade checks
that all subscriptions have a valid remote_lsn, so there should indeed always
be a value different from InvalidLSN/none specified, but it's still unclear to
me whether this check will eventually be weakened or not, so for now I think
it's better to have AlterSubscription accept this case, here and in all other
code paths.

If there's a hard objection I will just make the lsn mandatory.

I have tested, but srsublsn became NULL if copy_data was specified as off.
This is because when copy_data is false, all tuples in pg_subscription_rel are
created with state = 'r' and srsublsn = NULL, and tablesync workers are never
started. See CreateSubscription().
Doesn't it mean that there is a possibility that the LSN option is not
specified in ALTER SUBSCRIPTION ADD TABLE?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

#41Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Julien Rouhaud (#33)
RE: pg_upgrade and logical replication

Dear Julien,

I found a cfbot failure on macOS [1]. According to the log,
"SELECT count(*) FROM t2" was executed before synchronization was done.

```
[09:24:21.018](0.132s) not ok 18 - Table t2 should now have 3 rows on the new subscriber
```

With the patch present, wait_for_catchup() is executed after REFRESH, but
it may not be sufficient because it does not check pg_subscription_rel.
wait_for_subscription_sync() seems better for the purpose.

[1]: https://cirrus-ci.com/task/6563827802701824

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

#42Julien Rouhaud
rjuju123@gmail.com
In reply to: Peter Smith (#37)
Re: pg_upgrade and logical replication

Hi,

On Thu, Apr 13, 2023 at 03:26:56PM +1000, Peter Smith wrote:

1.
All the comments look alike, so it is hard to know what is going on.
If each of the main test parts could be highlighted then the test code
would be easier to read IMO.

Something like below:
[...]

I added a few more comments about what is being tested. I'm not sure that a
big TEST CASE prefix is necessary, as these aren't really multiple separate
test cases and other stuff can be tested in between. Also, AFAICT no other TAP
test currently needs this kind of banner, even when testing more complex
scenarios.

2.
+# replication origin's remote_lsn isn't set if not data is replicated after the
+# initial sync

wording:
/if not data is replicated/if data is not replicated/

I actually meant "if no data", which is a bit different from what you suggested.
Fixed.

3.
# Make sure the replication origin is set

I was not sure if all of the SELECT COUNT(*) checking is needed
because it just seems normal pub/sub functionality. There is no
pg_upgrade happening, so really it seemed the purpose of this part was
mainly to set the origin so that it will not be a blocker for
ready-state tests that follow this code. Maybe this can just be
incorporated into the following test part.

Since this patch is transferring internal details about subscriptions, I prefer
to be thorough about what is tested, when data is actually being replicated and
so on, so that if something is broken (a relation added to the wrong
subscription, a wrong oid or something) it immediately shows what's happening.

4a.
TBH, I felt it might be easier to follow if the SQL was checking for
WHERE (text = "while old_sub is down") etc, rather than just using
SELECT COUNT(*), and then trusting the comments to describe what the
different counts mean.

I prefer the plain count as it's a simple way to make sure that the state is
exactly what's wanted. If for some reason the patch led to a previous row
being replicated again, such a test wouldn't reveal it. Sure, it could be
broken enough that one old row is replicated twice and the new row isn't
replicated at all, but that seems so unlikely that I don't think testing the
whole table content is necessary.

4b.
All these messages like "Table t1 should still have 2 rows on the new
subscriber" don't seem very helpful. e.g. They are not saying anything
about WHAT this is testing or WHY it should still have 2 rows.

I don't think that those messages are supposed to say what or why something is
tested, just give a quick context / reference on the test in case it's broken.
The comments are there to explain in more details what is tested and/or why.

5.
# Refresh the subscription, only the missing row on t2 show be replicated

/show/should/

Fixed.

#43Julien Rouhaud
rjuju123@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#40)
Re: pg_upgrade and logical replication

Hi,

On Fri, Apr 14, 2023 at 04:19:35AM +0000, Hayato Kuroda (Fujitsu) wrote:

I have tested, but srsublsn became NULL if copy_data was specified as off.
This is because when copy_data is false, all tuples in pg_subscription_rels are filled
as state = 'r' and srsublsn = NULL, and tablesync workers will never boot.
See CreateSubscription().
Doesn't it mean that there is a possibility that LSN option is not specified while
ALTER SUBSCRIPTION ADD TABLE?

It shouldn't be the case for now, as pg_upgrade will first check whether there's
an invalid remote_lsn and refuse to proceed if that's the case. Also, the
remote_lsn should be set as soon as some data is replicated, so unless you add
a table that's never modified to a publication you should be able to run
pg_upgrade at some point, once there's replicated DML on such a table.

I'm personally fine with the current restrictions, but I don't really use
logical replication in any project so maybe I'm not objective enough. For now
I'd rather keep things as-is, and later improve on it if some people want to
lift such restrictions (and such restrictions can actually be lifted).
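As a sketch, the per-relation part of the restriction pg_upgrade enforces can be checked ahead of time on the old cluster with a query along these lines; any row returned would make the upgrade error out until the corresponding sync finishes:

```sql
-- Subscription relations not yet in 'r' (ready) state; an empty result
-- means the relation-state side of the pg_upgrade check would pass.
SELECT s.subname, sr.srrelid::regclass AS tbl, sr.srsubstate
FROM pg_subscription s
JOIN pg_subscription_rel sr ON sr.srsubid = s.oid
WHERE sr.srsubstate <> 'r';
```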

#44Julien Rouhaud
rjuju123@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#41)
1 attachment(s)
Re: pg_upgrade and logical replication

Hi,

On Tue, Apr 18, 2023 at 01:40:51AM +0000, Hayato Kuroda (Fujitsu) wrote:

I found a cfbot failure on macOS [1]. According to the log,
"SELECT count(*) FROM t2" was executed before synchronization was done.

```
[09:24:21.018](0.132s) not ok 18 - Table t2 should now have 3 rows on the new subscriber
```

With the patch present, wait_for_catchup() is executed after REFRESH, but
it may not be sufficient because it does not check pg_subscription_rel.
wait_for_subscription_sync() seems better for the purpose.

Fixed, thanks!

v5 attached with all previously mentioned fixes.

Attachments:

v5-0001-Optionally-preserve-the-full-subscription-s-state.patchtext/plain; charset=us-asciiDownload
From c6755ea3318220dc41bc315cc7acce4954e9b252 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Wed, 22 Feb 2023 09:19:32 +0800
Subject: [PATCH v5] Optionally preserve the full subscription's state during
 pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

Similarly, the subscriptions' replication origins are needed to ensure
that we don't replicate anything twice.

To fix this problem, this patch teaches pg_dump in binary upgrade mode to emit
additional ALTER SUBSCRIPTION subcommands that will restore the content of
pg_subscription_rel, and also provides an additional LSN parameter for CREATE
SUBSCRIPTION to restore the underlying replication origin remote LSN.  The new
ALTER SUBSCRIPTION subcommand and the new LSN parameter are not exposed to
users and only accepted in binary upgrade mode.

The new ALTER SUBSCRIPTION subcommand has the following syntax:

ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

The relation is identified by its oid, as it's preserved during pg_upgrade.
The lsn is optional, and defaults to NULL / InvalidXLogRecPtr if not provided.
Explicitly passing InvalidXLogRecPtr (0/0) is however not allowed.

This mode is optional and not enabled by default.  A new
--preserve-subscription-state option is added to pg_upgrade to use it.  For
now, pg_upgrade will check that all the subscriptions have a valid replication
origin remote_lsn, and that all underlying relations are in 'r' (ready) state,
and will error out if that's not the case, logging the reason for the failure.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml          |  23 +++
 src/backend/commands/subscriptioncmds.c  |  75 +++++++-
 src/backend/parser/gram.y                |  11 ++
 src/bin/pg_dump/common.c                 |  22 +++
 src/bin/pg_dump/pg_backup.h              |   2 +
 src/bin/pg_dump/pg_dump.c                | 136 +++++++++++++-
 src/bin/pg_dump/pg_dump.h                |  15 ++
 src/bin/pg_upgrade/check.c               |  81 +++++++++
 src/bin/pg_upgrade/dump.c                |   3 +-
 src/bin/pg_upgrade/meson.build           |   1 +
 src/bin/pg_upgrade/option.c              |   6 +
 src/bin/pg_upgrade/pg_upgrade.h          |   1 +
 src/bin/pg_upgrade/t/003_subscription.pl | 220 +++++++++++++++++++++++
 src/include/nodes/parsenodes.h           |   3 +-
 src/tools/pgindent/typedefs.list         |   1 +
 15 files changed, 595 insertions(+), 5 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/003_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 7816b4c685..6af790c986 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -240,6 +240,29 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--preserve-subscription-state</option></term>
+      <listitem>
+       <para>
+        Fully preserve the logical subscription state if any.  That includes
+        the underlying replication origin with their remote LSN and the list of
+        relations in each subscription so that replication can be simply
+        resumed if the subscriptions are reactivated.
+       </para>
+       <para>
+        If this option isn't used, it is up to the user to reactivate the
+        subscriptions in a suitable way; see the subscription part in <xref
+        linkend="pg-dump-notes"/> for more information.
+       </para>
+       <para>
+        If this option is used and any of the subscription on the old cluster
+        has an unknown <varname>remote_lsn</varname> (0/0), or has any relation
+        in a state different from <literal>r</literal> (ready), the
+        <application>pg_upgrade</application> run will error.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-?</option></term>
       <term><option>--help</option></term>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 56eafbff10..657db3791e 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -71,6 +71,8 @@
 #define SUBOPT_RUN_AS_OWNER			0x00001000
 #define SUBOPT_LSN					0x00002000
 #define SUBOPT_ORIGIN				0x00004000
+#define SUBOPT_RELID				0x00008000
+#define SUBOPT_STATE				0x00010000
 
 /* check if the 'val' has 'bits' set */
 #define IsSet(val, bits)  (((val) & (bits)) == (bits))
@@ -97,6 +99,8 @@ typedef struct SubOpts
 	bool		runasowner;
 	char	   *origin;
 	XLogRecPtr	lsn;
+	Oid			relid;
+	char		state;
 } SubOpts;
 
 static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -353,6 +357,46 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
 			opts->specified_opts |= SUBOPT_LSN;
 			opts->lsn = lsn;
 		}
+		else if (IsSet(supported_opts, SUBOPT_RELID) &&
+				 strcmp(defel->defname, "relid") == 0)
+		{
+			Oid			relid = defGetObjectId(defel);
+
+			Assert(IsBinaryUpgrade);
+
+			if (IsSet(opts->specified_opts, SUBOPT_RELID))
+				errorConflictingDefElem(defel, pstate);
+
+			if (!OidIsValid(relid))
+			{
+				char	   *rel_str = defGetString(defel);
+
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("invalid relation identifier used: %s", rel_str)));
+			}
+
+			opts->specified_opts |= SUBOPT_RELID;
+			opts->relid = relid;
+		}
+		else if (IsSet(supported_opts, SUBOPT_STATE) &&
+				 strcmp(defel->defname, "state") == 0)
+		{
+			char	   *state_str = defGetString(defel);
+
+			Assert(IsBinaryUpgrade);
+
+			if (IsSet(opts->specified_opts, SUBOPT_STATE))
+				errorConflictingDefElem(defel, pstate);
+
+			if (strlen(state_str) != 1)
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("invalid relation state: %s", state_str)));
+
+			opts->specified_opts |= SUBOPT_STATE;
+			opts->state = defGetString(defel)[0];
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -580,6 +624,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	bits32		supported_opts;
 	SubOpts		opts = {0};
 	AclResult	aclresult;
+	RepOriginId originid;
 
 	/*
 	 * Parse and check options.
@@ -592,6 +637,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 					  SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
 					  SUBOPT_DISABLE_ON_ERR | SUBOPT_PASSWORD_REQUIRED |
 					  SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN);
+	if (IsBinaryUpgrade)
+		supported_opts |= SUBOPT_LSN;
 	parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
 
 	/*
@@ -718,7 +765,12 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	recordDependencyOnOwner(SubscriptionRelationId, subid, owner);
 
 	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
-	replorigin_create(originname);
+	originid = replorigin_create(originname);
+
+	if (IsBinaryUpgrade && IsSet(opts.specified_opts, SUBOPT_LSN))
+		replorigin_advance(originid, opts.lsn, InvalidXLogRecPtr,
+						   false /* backward */ ,
+						   false /* WAL log */ );
 
 	/*
 	 * Connect to remote side to execute requested commands and fetch table
@@ -1428,6 +1480,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 				break;
 			}
 
+		case ALTER_SUBSCRIPTION_ADD_TABLE:
+			{
+				if (!IsBinaryUpgrade)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR)),
+							errmsg("ALTER SUBSCRIPTION ... ADD TABLE is not supported"));
+
+				supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
+				parse_subscription_options(pstate, stmt->options,
+										   supported_opts, &opts);
+
+				/* relid and state should always be provided. */
+				Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
+				Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
+
+				AddSubscriptionRelState(subid, opts.relid, opts.state,
+										opts.lsn);
+
+				break;
+			}
+
 		default:
 			elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
 				 stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index acf6cf4866..0432bf2cb4 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10695,6 +10695,17 @@ AlterSubscriptionStmt:
 					n->options = $5;
 					$$ = (Node *) n;
 				}
+			/* for binary upgrade only */
+			| ALTER SUBSCRIPTION name ADD_P TABLE definition
+				{
+					AlterSubscriptionStmt *n =
+						makeNode(AlterSubscriptionStmt);
+
+					n->kind = ALTER_SUBSCRIPTION_ADD_TABLE;
+					n->subname = $3;
+					n->options = $6;
+					$$ = (Node *) n;
+				}
 		;
 
 /*****************************************************************************
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 5d988986ed..29d2cc7cee 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -264,6 +265,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -974,6 +978,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index aba780ef4b..8c82657e76 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -200,6 +200,8 @@ typedef struct _dumpOptions
 
 	int			sequence_data;	/* dump sequence data even in schema-only mode */
 	int			do_nothing;
+
+	int		preserve_subscriptions;
 } DumpOptions;
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 058244cd17..a5336acb5b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -431,6 +431,7 @@ main(int argc, char **argv)
 		{"table-and-children", required_argument, NULL, 12},
 		{"exclude-table-and-children", required_argument, NULL, 13},
 		{"exclude-table-data-and-children", required_argument, NULL, 14},
+		{"preserve-subscription-state", no_argument, &dopt.preserve_subscriptions, 1},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -714,6 +715,10 @@ main(int argc, char **argv)
 	if (dopt.do_nothing && dopt.dump_inserts == 0)
 		pg_fatal("option --on-conflict-do-nothing requires option --inserts, --rows-per-insert, or --column-inserts");
 
+	/* --preserve-subscription-state requires --binary-upgrade */
+	if (dopt.preserve_subscriptions && !dopt.binary_upgrade)
+		pg_fatal("option --preserve-subscription-state requires option --binary-upgrade");
+
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
 
@@ -4585,6 +4590,92 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  get information about the given subscription's relations
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	SubscriptionInfo *subinfo;
+	SubRelInfo *rels = NULL;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i_nrels;
+	int			i,
+				cur_rel = 0,
+				ntups,
+				last_srsubid = InvalidOid;
+
+	if (!fout->dopt->binary_upgrade || !fout->dopt->preserve_subscriptions ||
+		fout->remoteVersion < 100000)
+	{
+		return;
+	}
+
+	query = createPQExpBuffer();
+
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn,"
+					  " count(*) OVER (PARTITION BY srsubid) AS nrels"
+					  " FROM pg_subscription_rel"
+					  " ORDER BY srsubid");
+
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+
+	if (ntups == 0)
+		goto cleanup;
+
+	/*
+	 * Get subscription relation fields.
+	 */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+	i_nrels = PQfnumber(res, "nrels");
+
+	for (i = 0; i < ntups; i++)
+	{
+		int			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+
+		/*
+		 * If we switched to a new subscription, setup the necessary fields in
+		 * the SubscriptionInfo and reset the cur_rel counter.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			int			nrels;
+
+			subinfo = findSubscriptionByOid(cur_srsubid);
+
+			nrels = atooid(PQgetvalue(res, i, i_nrels));
+			rels = pg_malloc(nrels * sizeof(SubRelInfo));
+
+			subinfo->subrels = rels;
+			subinfo->nrels = nrels;
+
+			last_srsubid = cur_srsubid;
+			cur_rel = 0;
+		}
+
+		rels[cur_rel].srrelid = atooid(PQgetvalue(res, i, i_srrelid));
+		rels[cur_rel].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		rels[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		cur_rel++;
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4610,6 +4701,7 @@ getSubscriptions(Archive *fout)
 	int			i_subpublications;
 	int			i_subbinary;
 	int			i_subpasswordrequired;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4664,15 +4756,19 @@ getSubscriptions(Archive *fout)
 	if (fout->remoteVersion >= 160000)
 		appendPQExpBufferStr(query,
 							 " s.suborigin,\n"
-							 " s.subpasswordrequired\n");
+							 " s.subpasswordrequired,\n");
 	else
 		appendPQExpBuffer(query,
 						  " '%s' AS suborigin,\n"
-						  " 't' AS subpasswordrequired\n",
+						  " 't' AS subpasswordrequired,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	appendPQExpBufferStr(query, "o.remote_lsn\n");
+
 	appendPQExpBufferStr(query,
 						 "FROM pg_subscription s\n"
+						 "LEFT JOIN pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4698,6 +4794,7 @@ getSubscriptions(Archive *fout)
 	i_subdisableonerr = PQfnumber(res, "subdisableonerr");
 	i_suborigin = PQfnumber(res, "suborigin");
 	i_subpasswordrequired = PQfnumber(res, "subpasswordrequired");
+	i_suboriginremotelsn = PQfnumber(res, "remote_lsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4730,6 +4827,18 @@ getSubscriptions(Archive *fout)
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
 		subinfo[i].subpasswordrequired =
 			pg_strdup(PQgetvalue(res, i, i_subpasswordrequired));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+
+		/*
+		 * For now assume there's no relation associated with the
+		 * subscription. Later code might update this field and allocate
+		 * subrels as needed.
+		 */
+		subinfo[i].nrels = 0;
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4814,9 +4923,31 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 	if (strcmp(subinfo->subpasswordrequired, "t") != 0)
 		appendPQExpBuffer(query, ", password_required = false");
 
+	if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
+	}
+
 	appendPQExpBufferStr(query, ");\n");
 
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		for (i = 0; i < subinfo->nrels; i++)
+		{
+			appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
+							  "(relid = %u, state = '%c'",
+							  qsubname,
+							  subinfo->subrels[i].srrelid,
+							  subinfo->subrels[i].srsubstate);
+
+			if (subinfo->subrels[i].srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", LSN = '%s'",
+								  subinfo->subrels[i].srsublsn);
+
+			appendPQExpBufferStr(query, ");");
+		}
+
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
 								  .owner = subinfo->rolname,
@@ -4824,6 +4955,7 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 								  .section = SECTION_POST_DATA,
 								  .createStmt = query->data,
 								  .dropStmt = delq->data));
+	}
 
 	if (subinfo->dobj.dump & DUMP_COMPONENT_COMMENT)
 		dumpComment(fout, "SUBSCRIPTION", qsubname,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index ed6ce41ad7..b9a39655c6 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -647,6 +647,16 @@ typedef struct _PublicationSchemaInfo
 	NamespaceInfo *pubschema;
 } PublicationSchemaInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	Oid			srrelid;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  * The SubscriptionInfo struct is used to represent subscription.
  */
@@ -664,6 +674,9 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *subpasswordrequired;
+	char	   *suboriginremotelsn;
+	int			nrels;
+	SubRelInfo *subrels;
 } SubscriptionInfo;
 
 /*
@@ -690,6 +703,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -749,5 +763,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fea159689e..e5dc0bd3c2 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -20,6 +20,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_state(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -104,6 +105,13 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
+	/* PG 10 introduced subscriptions. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1000 &&
+		user_opts.preserve_subscriptions)
+	{
+		check_for_subscription_state(&old_cluster);
+	}
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the on-disk
 	 * format for existing data.
@@ -785,6 +793,79 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_state()
+ *
+ * Verify that all subscriptions have a valid remote_lsn and don't contain
+ * any table in srsubstate different than ready ('r').
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)
+{
+	int			dbnum;
+	bool		is_error = false;
+
+	Assert(user_opts.preserve_subscriptions);
+
+	prep_status("Checking for subscription state");
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin_status only once. */
+		if (dbnum == 0)
+		{
+			int			ntup;
+
+			res = executeQueryOrDie(conn,
+									"SELECT s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT JOIN pg_catalog.pg_replication_origin_status os"
+									"  ON os.external_id = 'pg_' || s.oid "
+									"WHERE coalesce(remote_lsn, '0/0') = '0/0'");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				is_error = true;
+				pg_log(PG_WARNING,
+					   "\nWARNING:  subscription \"%s\" has an invalid remote_lsn",
+					   PQgetvalue(res, 0, 0));
+			}
+			PQclear(res);
+		}
+
+		res = executeQueryOrDie(conn,
+								"SELECT count(0) "
+								"FROM pg_catalog.pg_subscription_rel "
+								"WHERE srsubstate != 'r'");
+
+		if (PQntuples(res) != 1)
+			pg_fatal("could not determine the number of non-ready subscription relations");
+
+		if (strcmp(PQgetvalue(res, 0, 0), "0") != 0)
+		{
+			is_error = true;
+			pg_log(PG_WARNING,
+				   "\nWARNING: database \"%s\" has %s subscription "
+				   "relations(s) in non-ready state", active_db->db_name,
+				   PQgetvalue(res, 0, 0));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (is_error)
+		pg_fatal("--preserve-subscription-state is incompatible with "
+				 "subscription relations in non-ready state");
+
+	check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
diff --git a/src/bin/pg_upgrade/dump.c b/src/bin/pg_upgrade/dump.c
index 6c8c82dca8..9284576af7 100644
--- a/src/bin/pg_upgrade/dump.c
+++ b/src/bin/pg_upgrade/dump.c
@@ -53,9 +53,10 @@ generate_old_dump(void)
 
 		parallel_exec_prog(log_file_name, NULL,
 						   "\"%s/pg_dump\" %s --schema-only --quote-all-identifiers "
-						   "--binary-upgrade --format=custom %s --file=\"%s/%s\" %s",
+						   "--binary-upgrade --format=custom %s %s --file=\"%s/%s\" %s",
 						   new_cluster.bindir, cluster_conn_opts(&old_cluster),
 						   log_opts.verbose ? "--verbose" : "",
+						   user_opts.preserve_subscriptions ? "--preserve-subscription-state" : "",
 						   log_opts.dumpdir,
 						   sql_file_name, escaped_connstr.data);
 
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 12a97f84e2..9ea25dec70 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -42,6 +42,7 @@ tests += {
     'tests': [
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
+      't/003_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index 8869b6b60d..afed9ac5ce 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -57,6 +57,7 @@ parseCommandLine(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"clone", no_argument, NULL, 1},
 		{"copy", no_argument, NULL, 2},
+		{"preserve-subscription-state", no_argument, NULL, 3},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -199,6 +200,10 @@ parseCommandLine(int argc, char *argv[])
 				user_opts.transfer_mode = TRANSFER_MODE_COPY;
 				break;
 
+			case 3:
+				user_opts.preserve_subscriptions = true;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
 						os_info.progname);
@@ -289,6 +294,7 @@ usage(void)
 	printf(_("  -V, --version                 display version information, then exit\n"));
 	printf(_("  --clone                       clone instead of copying files to new cluster\n"));
 	printf(_("  --copy                        copy files to new cluster (default)\n"));
+	printf(_("  --preserve-subscription-state preserve the subscription state fully\n"));
 	printf(_("  -?, --help                    show this help, then exit\n"));
 	printf(_("\n"
 			 "Before running pg_upgrade you must:\n"
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 3eea0139c7..131fd9a56e 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -304,6 +304,7 @@ typedef struct
 	transferMode transfer_mode; /* copy files or link them? */
 	int			jobs;			/* number of processes/threads to use */
 	char	   *socketdir;		/* directory to use for Unix sockets */
+	bool		preserve_subscriptions; /* fully transfer subscription state */
 } UserOpts;
 
 typedef struct
diff --git a/src/bin/pg_upgrade/t/003_subscription.pl b/src/bin/pg_upgrade/t/003_subscription.pl
new file mode 100644
index 0000000000..053077150c
--- /dev/null
+++ b/src/bin/pg_upgrade/t/003_subscription.pl
@@ -0,0 +1,220 @@
+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use Cwd qw(abs_path);
+use File::Basename qw(dirname);
+use File::Compare;
+use File::Find qw(find);
+use File::Path qw(rmtree);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::AdjustUpgrade;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $bindir = $new_sub->config_data('--bindir');
+
+sub insert_line
+{
+	my $payload = shift;
+
+	foreach("t1", "t2")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("t1", "t2")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line('before initial sync');
+
+# Setup logical replication, replicating only 1 table
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub FOR TABLE t1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub CONNECTION '$connstr' PUBLICATION pub");
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'sub');
+
+# Check that pg_upgrade refuses to run if there's a subscription without a valid
+# remote_lsn.
+#
+# Replication origin's remote_lsn isn't set if no data is replicated after the
+# initial sync.
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+		'--preserve-subscription-state',
+		'--check',
+	],
+	'run of pg_upgrade --check for old instance with invalid remote_lsn');
+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ not removed after pg_upgrade failure");
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Make sure the replication origin is set
+insert_line('after initial sync');
+$old_sub->wait_for_subscription_sync($publisher, 'sub');
+
+my $result = $old_sub->safe_psql('postgres',
+    "SELECT COUNT(*) FROM pg_subscription_rel WHERE srsubstate != 'r'");
+is ($result, qq(0), "All tables in pg_subscription_rel should be in ready state");
+
+# Check the number of rows for each table on each server
+$result = $publisher->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(2), "Table t1 should have 2 rows on the publisher");
+$result = $publisher->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(2), "Table t2 should have 2 rows on the publisher");
+$result = $old_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(2), "Table t1 should have 2 rows on the old subscriber");
+$result = $old_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(0), "Table t2 should have 0 rows on the old subscriber");
+
+# Check that pg_upgrade refuses to run if there's a subscription with tables in
+# a state different than 'r' (ready).
+$old_sub->safe_psql('postgres',
+    "ALTER SUBSCRIPTION sub DISABLE");
+$old_sub->safe_psql('postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'i' WHERE srsubstate = 'r'");
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+		'--preserve-subscription-state',
+		'--check',
+	],
+	'run of pg_upgrade --check for old instance with incorrect sub rel');
+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ not removed after pg_upgrade failure");
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Check that pg_upgrade doesn't detect any problem once all the subscription's
+# relation are in 'r' (ready) state.
+$old_sub->safe_psql('postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'r' WHERE srsubstate = 'i'");
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+		'--preserve-subscription-state',
+		'--check',
+	],
+	'run of pg_upgrade --check for old instance with correct sub rel');
+
+# Stop the old subscriber, insert a row in each table while it's down and add
+# t2 to the publication
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+insert_line('while old_sub is down');
+
+# Run pg_upgrade
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+		'--preserve-subscription-state',
+	],
+	'run of pg_upgrade for new sub');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after pg_upgrade success");
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub ADD TABLE t2");
+
+$new_sub->start;
+
+# Subscription relations and replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+    "SELECT count(*) FROM pg_subscription_rel");
+is ($result, qq(1), "There should be 1 row in pg_subscription_rel");
+
+$result = $new_sub->safe_psql('postgres',
+    "SELECT remote_lsn FROM pg_replication_origin_status");
+is ($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# There should be no new replicated rows before enabling the subscription
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(2), "Table t1 should still have 2 rows on the new subscriber");
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(0), "Table t2 should still have 0 rows on the new subscriber");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION sub ENABLE");
+
+$publisher->wait_for_catchup('sub');
+
+# Rows on t1 should have been replicated, while nothing should happen for t2
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(3), "Table t1 should now have 3 rows on the new subscriber");
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(0), "Table t2 should still have 0 rows on the new subscriber");
+
+# Refresh the subscription, only the missing row on t2 should be replicated
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'sub');
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(3), "Table t1 should still have 3 rows on the new subscriber");
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(3), "Table t2 should now have 3 rows on the new subscriber");
+
+done_testing();
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index cc7b32b279..0ec85ceda2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -4028,7 +4028,8 @@ typedef enum AlterSubscriptionType
 	ALTER_SUBSCRIPTION_DROP_PUBLICATION,
 	ALTER_SUBSCRIPTION_REFRESH,
 	ALTER_SUBSCRIPTION_ENABLED,
-	ALTER_SUBSCRIPTION_SKIP
+	ALTER_SUBSCRIPTION_SKIP,
+	ALTER_SUBSCRIPTION_ADD_TABLE
 } AlterSubscriptionType;
 
 typedef struct AlterSubscriptionStmt
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b4058b88c3..ad13521447 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2647,6 +2647,7 @@ SubqueryScan
 SubqueryScanPath
 SubqueryScanState
 SubqueryScanStatus
+SubRelInfo
 SubscriptExecSetup
 SubscriptExecSteps
 SubscriptRoutines
-- 
2.37.0

#45Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Julien Rouhaud (#44)
RE: pg_upgrade and logical replication

Dear Julien,

Thank you for updating the patch! Followings are my comments.

01. documentation

This page lists the steps for upgrading a server with pg_upgrade. Should we
also write something about subscribers? IIUC it is sufficient to add a note to
"Run pg_upgrade", like "Apart from streaming replication standbys, subscriber
nodes can be upgraded via pg_upgrade. In that case we strongly recommend using
--preserve-subscription-state".

02. AlterSubscription

I agree that the oid must be preserved between nodes, but I'm still afraid that
the given oid is unconditionally trusted and added to pg_subscription_rel.
I think we can check the existence of the relation via SearchSysCache1(RELOID,
ObjectIdGetDatum(relid)). Of course the check is optional, so it could be
executed only when USE_ASSERT_CHECKING is on. Thoughts?

03. main

Currently --preserve-subscription-state and --no-subscriptions can be used
together, but that combination is quite unnatural. Shouldn't we make them
mutually exclusive?

04. getSubscriptionTables

```
+ SubRelInfo *rels = NULL;
```

The variable is used only inside the loop, so its declaration should be moved there as well.

05. getSubscriptionTables

```
+ nrels = atooid(PQgetvalue(res, i, i_nrels));
```

atoi() should be used instead of atooid().

06. getSubscriptionTables

```
+                       subinfo = findSubscriptionByOid(cur_srsubid);
+
+                       nrels = atooid(PQgetvalue(res, i, i_nrels));
+                       rels = pg_malloc(nrels * sizeof(SubRelInfo));
+
+                       subinfo->subrels = rels;
+                       subinfo->nrels = nrels;
```

Maybe it never occurs, but findSubscriptionByOid() can return NULL, and
accessing its attributes would then cause a segfault. Some handling is needed.

07. dumpSubscription

Hmm, SubRelInfos are still dumped from dumpSubscription(). I think this style
breaks pg_dump's conventions, and a separate dump function is needed. Please
see dumpPublicationTable() and dumpPublicationNamespace(). If you have a reason
to keep this style, a comment describing it is needed.

08. _SubRelInfo

If you address the above comment, a DumpableObject must be added as a new attribute.

09. check_for_subscription_state

```
+                       for (int i = 0; i < ntup; i++)
+                       {
+                               is_error = true;
+                               pg_log(PG_WARNING,
+                                          "\nWARNING:  subscription \"%s\" has an invalid remote_lsn",
+                                          PQgetvalue(res, 0, 0));
+                       }
```

The second argument of PQgetvalue() should be i, so that the names of all subscriptions are reported when there is more than one tuple.

10. 003_subscription.pl

```
$old_sub->wait_for_subscription_sync($publisher, 'sub');

my $result = $old_sub->safe_psql('postgres',
"SELECT COUNT(*) FROM pg_subscription_rel WHERE srsubstate != 'r'");
is ($result, qq(0), "All tables in pg_subscription_rel should be in ready state");
```

I think there is a possibility of a timing issue, because the SELECT may be
executed before srsubstate is changed from 's' to 'r'. Maybe poll_query_until()
could be used instead.

11. 003_subscription.pl

```
command_ok(
[
'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
'-D', $new_sub->data_dir, '-b', $bindir,
'-B', $bindir, '-s', $new_sub->host,
'-p', $old_sub->port, '-P', $new_sub->port,
$mode,
'--preserve-subscription-state',
'--check',
],
'run of pg_upgrade --check for old instance with correct sub rel');
```

Missing check of pg_upgrade_output.d?

And maybe you forgot to run pgperltidy.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

#46Peter Smith
smithpb2250@gmail.com
In reply to: Julien Rouhaud (#44)
Re: pg_upgrade and logical replication

Here are some review comments for the v5-0001 patch code.

======
General

1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

I was a bit confused by this relation 'state' mentioned in multiple
places. IIUC the pg_upgrade logic is going to reject anything with a
non-READY (not 'r') state anyhow, so what is the point of having all
the extra grammar/parse_subscription_options etc. to handle setting the
state when the only possible value is 'r'?

~~~

2. state V relstate

I still feel code readability suffers a bit from calling some fields/vars
a generic 'state' instead of the more descriptive 'relstate'. Maybe
it's just me.

Previously commented the same (see [1], #3, #4, #5).

======
doc/src/sgml/ref/pgupgrade.sgml

3.
+       <para>
+        Fully preserve the logical subscription state if any.  That includes
+        the underlying replication origin with their remote LSN and the list of
+        relations in each subscription so that replication can be simply
+        resumed if the subscriptions are reactivated.
+       </para>

I think the "if any" part is not necessary. If you remove those words,
then the rest of the sentence can be simplified.

SUGGESTION
Fully preserve the logical subscription state, which includes the
underlying replication origin's remote LSN, and the list of relations
in each subscription. This allows replication to simply resume when
the subscriptions are reactivated.

~~~

4.
+       <para>
+        If this option isn't used, it is up to the user to reactivate the
+        subscriptions in a suitable way; see the subscription part in <xref
+        linkend="pg-dump-notes"/> for more information.
+       </para>

The link still renders strangely as previously reported (see [1], #2b).

~~~

5.
+       <para>
+        If this option is used and any of the subscription on the old cluster
+        has an unknown <varname>remote_lsn</varname> (0/0), or has any relation
+        in a state different from <literal>r</literal> (ready), the
+        <application>pg_upgrade</application> run will error.
+       </para>

5a.
/subscription/subscriptions/

~

5b
"has any relation in a state different from r" --> "has any relation
with state other than r"

======
src/backend/commands/subscriptioncmds.c

6.
+ if (strlen(state_str) != 1)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid relation state: %s", state_str)));

Is this relation state validation overly simplistic, by only checking
for length 1? Shouldn't this just be asserting the relstate must be
'r'?

======
src/bin/pg_dump/pg_dump.c

7. getSubscriptionTables

+/*
+ * getSubscriptionTables
+ *   get information about the given subscription's relations
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+ SubscriptionInfo *subinfo;
+ SubRelInfo *rels = NULL;
+ PQExpBuffer query;
+ PGresult   *res;
+ int i_srsubid;
+ int i_srrelid;
+ int i_srsubstate;
+ int i_srsublsn;
+ int i_nrels;
+ int i,
+ cur_rel = 0,
+ ntups,
+ last_srsubid = InvalidOid;

Why some above are single int declarations and some are compound int
declarations? Why not make them all consistent?

~

8.
+ appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn,"
+   " count(*) OVER (PARTITION BY srsubid) AS nrels"
+   " FROM pg_subscription_rel"
+   " ORDER BY srsubid");

Should this SQL be schema-qualified like pg_catalog.pg_subscription_rel?

~

9.
+ for (i = 0; i < ntups; i++)
+ {
+ int cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));

Should 'cur_srsubid' be declared Oid to match the atooid?

~~~

10. getSubscriptions

+ if (PQgetisnull(res, i, i_suboriginremotelsn))
+ subinfo[i].suboriginremotelsn = NULL;
+ else
+ subinfo[i].suboriginremotelsn =
+ pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+
+ /*
+ * For now assume there's no relation associated with the
+ * subscription. Later code might update this field and allocate
+ * subrels as needed.
+ */
+ subinfo[i].nrels = 0;

The wording "For now assume there's no" kind of gives an ambiguous
interpretation for this comment. IMO it sounds like this is the
"current" logic but some future PG version may behave differently - I
don't think that is the intended meaning at all.

SUGGESTION.
Here we just initialize nrels to say there are 0 relations associated
with the subscription. If necessary, subsequent logic will update this
field and allocate the subrels.

~~~

11. dumpSubscription

+ for (i = 0; i < subinfo->nrels; i++)
+ {
+ appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
+   "(relid = %u, state = '%c'",
+   qsubname,
+   subinfo->subrels[i].srrelid,
+   subinfo->subrels[i].srsubstate);
+
+ if (subinfo->subrels[i].srsublsn[0] != '\0')
+ appendPQExpBuffer(query, ", LSN = '%s'",
+   subinfo->subrels[i].srsublsn);
+
+ appendPQExpBufferStr(query, ");");
+ }

I previously asked ([1], #11) how this ALTER SUBSCRIPTION ... ADD TABLE
code can happen unless 'preserve_subscriptions' is true, and you
confirmed "It indirectly is, as in that case subinfo->nrels is
guaranteed to be 0. I just tried to keep the code simpler and avoid
too many nested conditions."

~

If you are worried about too many nested conditions then a simple
Assert(dopt->preserve_subscriptions); might be good to have here.

======
src/bin/pg_upgrade/check.c

12. check_and_dump_old_cluster

+ /* PG 10 introduced subscriptions. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1000 &&
+ user_opts.preserve_subscriptions)
+ {
+ check_for_subscription_state(&old_cluster);
+ }

12a.
All the other checks in this function seem to be in decreasing order
of PG version so maybe this check should be moved to follow that same
pattern.

~

12b.
Also, wouldn't it be better to give some error or notice of some kind if
the option and version are incompatible? I think this was mentioned in a
previous review.

e.g.

if (user_opts.preserve_subscriptions)
{
if (GET_MAJOR_VERSION(old_cluster.major_version) < 1000)
<pg_log or pg_fatal goes here...>;
check_for_subscription_state(&old_cluster);
}

~~~

13. check_for_subscription_state

+ for (int i = 0; i < ntup; i++)
+ {
+ is_error = true;
+ pg_log(PG_WARNING,
+    "\nWARNING:  subscription \"%s\" has an invalid remote_lsn",
+    PQgetvalue(res, 0, 0));
+ }

13a.
This WARNING does not mention the database, but a similar warning
later about the non-ready state does mention the database. Probably
they should be consistent.

~

13b.
Something seems amiss. Here is_error is assigned true, but the later
is_error test is about logging the ready-state problem. Isn't there a
missing pg_fatal for this invalid remote_lsn case?

======
src/bin/pg_upgrade/option.c

14. usage

+ printf(_(" --preserve-subscription-state preserve the subscription state fully\n"));

Why say "fully"? How is "preserve the subscription state fully"
different to "preserve the subscription state" from the user's POV?

------
[1]: My previous v4 code review - /messages/by-id/CAHut+PuThBY=MSYHRgUa6iv6tyCmnqU78itZ+f4rMM2b124vqQ@mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

#47Peter Smith
smithpb2250@gmail.com
In reply to: Julien Rouhaud (#42)
Re: pg_upgrade and logical replication

On Mon, Apr 24, 2023 at 4:19 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

> Hi,
>
> On Thu, Apr 13, 2023 at 03:26:56PM +1000, Peter Smith wrote:
> > 1.
> > All the comments look alike, so it is hard to know what is going on.
> > If each of the main test parts could be highlighted then the test code
> > would be easier to read IMO.
> >
> > Something like below:
> > [...]
>
> I added a bit more comments about what is being tested. I'm not sure that a
> big TEST CASE prefix is necessary, as it's not really multiple separated test
> cases and other stuff can be tested in between. Also AFAICT no other TAP test
> currently needs this kind of banner, even if they're testing more complex
> scenarios.

Hmm, I think there are plenty of examples of subscription TAP tests
having some kind of highlighted comments as suggested, for better
readability.

e.g. See src/test/subscription
t/014_binary.pl
t/015_stream.pl
t/016_stream_subxact.pl
t/018_stream_subxact_abort.pl
t/021_twophase.pl
t/022_twophase_cascade.pl
t/023_twophase_stream.pl
t/028_row_filter.pl
t/030_origin.pl
t/031_column_list.pl
t/032_subscribe_use_index.pl

A simple #################### to separate the main test parts is all
that is needed.
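For illustration, the banner style used in those files looks something like
this (sketch only; the wording of the comment is up to you):

```perl
##################################################
# Check that the preserved subscription state lets replication resume
# on the upgraded node without re-copying already-synced tables.
##################################################
```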

4b.
All these messages like "Table t1 should still have 2 rows on the new
subscriber" don't seem very helpful. e.g. They are not saying anything
about WHAT this is testing or WHY it should still have 2 rows.

I don't think that those messages are supposed to say what or why something is
tested, just give a quick context / reference on the test in case it's broken.
The comments are there to explain in more detail what is tested and/or why.

But, why can’t they do both? They can be a quick reference *and* at
the same time give some more meaning to the error log. Otherwise,
these messages might as well just say ‘ref1’, ‘ref2’, ‘ref3’...

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#48vignesh C
vignesh21@gmail.com
In reply to: Julien Rouhaud (#44)
Re: pg_upgrade and logical replication

On Mon, 24 Apr 2023 at 12:52, Julien Rouhaud <rjuju123@gmail.com> wrote:

Hi,

On Tue, Apr 18, 2023 at 01:40:51AM +0000, Hayato Kuroda (Fujitsu) wrote:

I found a cfbot failure on macOS [1]. According to the log,
"SELECT count(*) FROM t2" was executed before synchronization was done.

```
[09:24:21.018](0.132s) not ok 18 - Table t2 should now have 3 rows on the new subscriber
```

With the patch present, wait_for_catchup() is executed after REFRESH, but
it may not be sufficient because it does not check pg_subscription_rel.
wait_for_subscription_sync() seems better for the purpose.

Fixed, thanks!

I had a high-level look at the patch; a few comments:
1) The new ereport style can be used by removing the brackets around errcode:
1.a)
+                               ereport(ERROR,
+                                       (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                                        errmsg("invalid relation identifier used: %s", rel_str)));
+                       }
1.b)
+                       if (strlen(state_str) != 1)
+                               ereport(ERROR,
+                                       (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                                        errmsg("invalid relation state: %s", state_str)));
1.c)
+               case ALTER_SUBSCRIPTION_ADD_TABLE:
+                       {
+                               if (!IsBinaryUpgrade)
+                                       ereport(ERROR,
+                                               (errcode(ERRCODE_SYNTAX_ERROR)),
+                                               errmsg("ALTER SUBSCRIPTION ... ADD TABLE is not supported"));
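For reference, the parenthesis-free style being suggested would look roughly
like this, using 1.b as an example (a fragment only, not compilable on its
own):

```c
/* Sketch of the newer ereport style: no parentheses around the auxiliary calls */
if (strlen(state_str) != 1)
    ereport(ERROR,
            errcode(ERRCODE_INVALID_PARAMETER_VALUE),
            errmsg("invalid relation state: %s", state_str));
```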
2) Since this is a single statement, the braces are not required in this case:
2.a)
+       if (!fout->dopt->binary_upgrade || !fout->dopt->preserve_subscriptions ||
+               fout->remoteVersion < 100000)
+       {
+               return;
+       }
2.b) Similarly here too:
+       if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
+               subinfo->suboriginremotelsn)
+       {
+               appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
+       }
3) Since this comment is a very short comment, this can be changed
into a single line comment:
+       /*
+        * Get subscription relation fields.
+        */
4) Since cur_rel will be initialized in "if (cur_srsubid !=
last_srsubid)", it need not be initialized here:
+       int                     i,
+                               cur_rel = 0,
+                               ntups,
5) SubRelInfo should be placed above SubRemoveRels:
+++ b/src/tools/pgindent/typedefs.list
@@ -2647,6 +2647,7 @@ SubqueryScan
 SubqueryScanPath
 SubqueryScanState
 SubqueryScanStatus
+SubRelInfo
 SubscriptExecSetup

Regards,
Vignesh

#49vignesh C
vignesh21@gmail.com
In reply to: Julien Rouhaud (#44)
Re: pg_upgrade and logical replication

On Mon, 24 Apr 2023 at 12:52, Julien Rouhaud <rjuju123@gmail.com> wrote:

Hi,

On Tue, Apr 18, 2023 at 01:40:51AM +0000, Hayato Kuroda (Fujitsu) wrote:

I found a cfbot failure on macOS [1]. According to the log,
"SELECT count(*) FROM t2" was executed before synchronization was done.

```
[09:24:21.018](0.132s) not ok 18 - Table t2 should now have 3 rows on the new subscriber
```

With the patch present, wait_for_catchup() is executed after REFRESH, but
it may not be sufficient because it does not check pg_subscription_rel.
wait_for_subscription_sync() seems better for the purpose.

Fixed, thanks!

v5 attached with all previously mentioned fixes.

A few comments:
1) Should we document this command:
+               case ALTER_SUBSCRIPTION_ADD_TABLE:
+                       {
+                               if (!IsBinaryUpgrade)
+                                       ereport(ERROR,
+                                               (errcode(ERRCODE_SYNTAX_ERROR)),
+                                               errmsg("ALTER SUBSCRIPTION ... ADD TABLE is not supported"));
+
+                               supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
+                               parse_subscription_options(pstate, stmt->options,
+                                                          supported_opts, &opts);
+
+                               /* relid and state should always be provided. */
+                               Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
+                               Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
+
+                               AddSubscriptionRelState(subid, opts.relid, opts.state,
+                                                       opts.lsn);
+

Should we document something like:
This command is for use by in-place upgrade utilities. Its use for
other purposes is not recommended or supported. The behavior of the
option may change in future releases without notice.

2) Similarly in pg_dump too:
@@ -431,6 +431,7 @@ main(int argc, char **argv)
                {"table-and-children", required_argument, NULL, 12},
                {"exclude-table-and-children", required_argument, NULL, 13},
                {"exclude-table-data-and-children", required_argument, NULL, 14},
+               {"preserve-subscription-state", no_argument, &dopt.preserve_subscriptions, 1},

Should we document something like:
This command is for use by in-place upgrade utilities. Its use for
other purposes is not recommended or supported. The behavior of the
option may change in future releases without notice.

3) The same error is possible for a table in ready state but with an invalid
remote_lsn; should we include that case too in the error message:
+       if (is_error)
+               pg_fatal("--preserve-subscription-state is incompatible with "
+                                "subscription relations in non-ready state");
+
+       check_ok();
+}

Regards,
Vignesh

#50Michael Paquier
michael@paquier.xyz
In reply to: Peter Smith (#46)
Re: pg_upgrade and logical replication

On Wed, May 10, 2023 at 05:59:24PM +1000, Peter Smith wrote:

1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

I was a bit confused by this relation 'state' mentioned in multiple
places. IIUC the pg_upgrade logic is going to reject anything with a
non-READY (not 'r') state anyhow, so what is the point of having all
the extra grammar/parse_subscription_options etc to handle setting the
state when only possible value must be 'r'?

We are just talking about the handling of an extra DefElem in an
extensible grammar pattern, so adding the state field does not
represent much maintenance work. I'm OK with the addition of this
field in the data set dumped, FWIW, on the ground that it can be
useful for debugging purposes when looking at --binary-upgrade dumps,
and because we aim at copying catalog contents from one cluster to
another.

Anyway, I am not convinced that we have any need for a parse-able
grammar at all, because anything that's presented on this thread is
aimed at being used only for the internal purpose of an upgrade in a
--binary-upgrade dump with a direct catalog copy in mind, and having a
grammar would encourage abuses of it outside of this context. I think
that we should aim for simpler than what's proposed by the patch,
actually, with either a single SQL function à-la-binary_upgrade() that
adds the contents of a relation. Or we can be crazier and just create
INSERT queries for pg_subscription_rel to provide an exact copy of the
catalog contents. A SQL function would be more consistent with other
object types that use similar tricks, see
binary_upgrade_create_empty_extension() that does something similar
for some pg_extension records. So, this function would take four
input arguments:
- The subscription name or OID.
- The relation OID.
- Its LSN.
- Its sync state.
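As a sketch, the --binary-upgrade dump could then emit one such call per
pg_subscription_rel row; the function name is the proposal above, while the
subscription name, relation OID and LSN below are purely illustrative:

```sql
-- Hypothetical dump output for one relation of subscription "mysub";
-- 16384 is an illustrative relation OID, '0/12345678' an illustrative LSN.
SELECT binary_upgrade_create_sub_rel_state('mysub', 16384, 'r', '0/12345678');
```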

2. state V relstate

I still feel code readability suffers a bit by calling some fields/vars
a generic 'state' instead of the more descriptive 'relstate'. Maybe
it's just me.

Previously commented same (see [1]#3, #4, #5)

Agreed to be more careful with the naming here.
--
Michael

#51Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#50)
Re: pg_upgrade and logical replication

On Wed, Jul 19, 2023 at 12:47 PM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, May 10, 2023 at 05:59:24PM +1000, Peter Smith wrote:

1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

I was a bit confused by this relation 'state' mentioned in multiple
places. IIUC the pg_upgrade logic is going to reject anything with a
non-READY (not 'r') state anyhow, so what is the point of having all
the extra grammar/parse_subscription_options etc to handle setting the
state when only possible value must be 'r'?

We are just talking about the handling of an extra DefElem in an
extensible grammar pattern, so adding the state field does not
represent much maintenance work. I'm OK with the addition of this
field in the data set dumped, FWIW, on the ground that it can be
useful for debugging purposes when looking at --binary-upgrade dumps,
and because we aim at copying catalog contents from one cluster to
another.

Anyway, I am not convinced that we have any need for a parse-able
grammar at all, because anything that's presented on this thread is
aimed at being used only for the internal purpose of an upgrade in a
--binary-upgrade dump with a direct catalog copy in mind, and having a
grammar would encourage abuses of it outside of this context. I think
that we should aim for simpler than what's proposed by the patch,
actually, with either a single SQL function à-la-binary_upgrade() that
adds the contents of a relation. Or we can be crazier and just create
INSERT queries for pg_subscription_rel to provide an exact copy of the
catalog contents. A SQL function would be more consistent with other
objects types that use similar tricks, see
binary_upgrade_create_empty_extension() that does something similar
for some pg_extension records. So, this function would require in
input 4 arguments:
- The subscription name or OID.
- The relation OID.
- Its LSN.
- Its sync state.

+1 for doing it via function (something like
binary_upgrade_create_sub_rel_state). We already have the internal
function AddSubscriptionRelState() that can do the core work.

Like the publisher-side upgrade patch [1], I think we should allow
upgrading subscriptions by default instead with some flag like
--preserve-subscription-state. If required, we can introduce --exclude
option for upgrade. Having it just for pg_dump sounds reasonable to
me.

[1]: /messages/by-id/TYAPR01MB58664C81887B3AF2EB6B16E3F5939@TYAPR01MB5866.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.

#52Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#45)
Re: pg_upgrade and logical replication

On Thu, Apr 27, 2023 at 1:18 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

03. main

Currently --preserve-subscription-state and --no-subscriptions can be used
together, but the situation is quite unnatural. Shouldn't we exclude them?

Right, that makes sense to me.

--
With Regards,
Amit Kapila.

#53Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#51)
Re: pg_upgrade and logical replication

On Mon, Sep 04, 2023 at 11:51:14AM +0530, Amit Kapila wrote:

+1 for doing it via function (something like
binary_upgrade_create_sub_rel_state). We already have the internal
function AddSubscriptionRelState() that can do the core work.

It is one of these patches that I have left aside for too long, and it
solves a use-case of its own. I think that I could hack that pretty
quickly given that Julien has done a bunch of the ground work. Would
you agree with that?

Like the publisher-side upgrade patch [1], I think we should allow
upgrading subscriptions by default instead with some flag like
--preserve-subscription-state. If required, we can introduce --exclude
option for upgrade. Having it just for pg_dump sounds reasonable to
me.

[1] - /messages/by-id/TYAPR01MB58664C81887B3AF2EB6B16E3F5939@TYAPR01MB5866.jpnprd01.prod.outlook.com

Is the interface of the publisher side for pg_upgrade agreed on and set
in stone? I certainly agree to have a consistent upgrade experience for
the two sides of logical replication, publications and subscriptions.
Also, I'd rather have a filtering option at the same time as the
upgrade option to give more control to users from the start.
--
Michael

#54Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#51)
Re: pg_upgrade and logical replication

On Mon, Sep 4, 2023 at 11:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jul 19, 2023 at 12:47 PM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, May 10, 2023 at 05:59:24PM +1000, Peter Smith wrote:

1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

I was a bit confused by this relation 'state' mentioned in multiple
places. IIUC the pg_upgrade logic is going to reject anything with a
non-READY (not 'r') state anyhow, so what is the point of having all
the extra grammar/parse_subscription_options etc to handle setting the
state when only possible value must be 'r'?

We are just talking about the handling of an extra DefElem in an
extensible grammar pattern, so adding the state field does not
represent much maintenance work. I'm OK with the addition of this
field in the data set dumped, FWIW, on the ground that it can be
useful for debugging purposes when looking at --binary-upgrade dumps,
and because we aim at copying catalog contents from one cluster to
another.

Anyway, I am not convinced that we have any need for a parse-able
grammar at all, because anything that's presented on this thread is
aimed at being used only for the internal purpose of an upgrade in a
--binary-upgrade dump with a direct catalog copy in mind, and having a
grammar would encourage abuses of it outside of this context. I think
that we should aim for simpler than what's proposed by the patch,
actually, with either a single SQL function à-la-binary_upgrade() that
adds the contents of a relation. Or we can be crazier and just create
INSERT queries for pg_subscription_rel to provide an exact copy of the
catalog contents. A SQL function would be more consistent with other
objects types that use similar tricks, see
binary_upgrade_create_empty_extension() that does something similar
for some pg_extension records. So, this function would require in
input 4 arguments:
- The subscription name or OID.
- The relation OID.
- Its LSN.
- Its sync state.

+1 for doing it via function (something like
binary_upgrade_create_sub_rel_state). We already have the internal
function AddSubscriptionRelState() that can do the core work.

One more related point:
@@ -4814,9 +4923,31 @@ dumpSubscription(Archive *fout, const
SubscriptionInfo *subinfo)
if (strcmp(subinfo->subpasswordrequired, "t") != 0)
appendPQExpBuffer(query, ", password_required = false");

+ if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
+ subinfo->suboriginremotelsn)
+ {
+ appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
+ }

Even during Create Subscription, we can use an existing function
(pg_replication_origin_advance()) or a set of functions to advance the
origin instead of introducing a new option.
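For example, instead of a new lsn option, the dump could advance the
subscription's origin with the existing SQL function after creating the
subscription (sketch only; a subscription's origin is named "pg_" plus its
OID, so the 'pg_16394' name and the LSN below are illustrative):

```sql
-- Illustrative: advance the replication origin of subscription OID 16394
SELECT pg_replication_origin_advance('pg_16394', '0/12345678');
```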

--
With Regards,
Amit Kapila.

#55Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#53)
Re: pg_upgrade and logical replication

On Mon, Sep 4, 2023 at 12:15 PM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Sep 04, 2023 at 11:51:14AM +0530, Amit Kapila wrote:

+1 for doing it via function (something like
binary_upgrade_create_sub_rel_state). We already have the internal
function AddSubscriptionRelState() that can do the core work.

It is one of these patches that I have let aside for too long, and it
solves a use-case of its own. I think that I could hack that pretty
quickly given that Julien has done a bunch of the ground work. Would
you agree with that?

Yeah, I agree that could be hacked quickly but note I haven't reviewed
in detail if there are other design issues in this patch. Note that we
first thought to support upgrading the publisher node because otherwise,
immediately after upgrading the subscriber and publisher, the
subscriptions won't work and will start giving errors as they are dependent
on slots in the publisher. One other point that needs some thought is
that the LSN positions we are going to copy in the catalog may no
longer be valid after the upgrade (of the publisher) because we reset
WAL. Does that need some special consideration or are we okay with
that in all cases? As of now, things are quite safe as documented in
pg_dump doc page that it will be the user's responsibility to set up
replication after dump/restore. I think it would be really helpful if
you could share your thoughts on the publisher-side matter as we are
facing a few tricky questions to be answered. For example, see a new
thread [1].

Like the publisher-side upgrade patch [1], I think we should allow
upgrading subscriptions by default instead with some flag like
--preserve-subscription-state. If required, we can introduce --exclude
option for upgrade. Having it just for pg_dump sounds reasonable to
me.

[1] - /messages/by-id/TYAPR01MB58664C81887B3AF2EB6B16E3F5939@TYAPR01MB5866.jpnprd01.prod.outlook.com

In the interface of the publisher for pg_upgrade agreed on and set in
stone? I certainly agree to have a consistent upgrade experience for
the two sides of logical replication, publications and subscriptions.
Also, I'd rather have a filtering option at the same time as the
upgrade option to give more control to users from the start.

The point raised by Jonathan for not having an option for pg_upgrade
is that it will be easier for users, who would otherwise always need to
enable this option. Consider a replication setup: wouldn't users want
it to be upgraded by default? Asking them to do that via an option
would be an inconvenience. So, that was the reason we wanted to have
an --exclude option and by default allow slots to be upgraded. I think
the same theory applies here.

[1]: /messages/by-id/CAA4eK1LV3+76CSOAk0h8Kv0AKb-OETsJHe6Sq6172-7DZXf0Qg@mail.gmail.com

--
With Regards,
Amit Kapila.

#56vignesh C
vignesh21@gmail.com
In reply to: Michael Paquier (#50)
1 attachment(s)
Re: pg_upgrade and logical replication

On Wed, 19 Jul 2023 at 12:47, Michael Paquier <michael@paquier.xyz> wrote:

On Wed, May 10, 2023 at 05:59:24PM +1000, Peter Smith wrote:

1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

I was a bit confused by this relation 'state' mentioned in multiple
places. IIUC the pg_upgrade logic is going to reject anything with a
non-READY (not 'r') state anyhow, so what is the point of having all
the extra grammar/parse_subscription_options etc to handle setting the
state when only possible value must be 'r'?

We are just talking about the handling of an extra DefElem in an
extensible grammar pattern, so adding the state field does not
represent much maintenance work. I'm OK with the addition of this
field in the data set dumped, FWIW, on the ground that it can be
useful for debugging purposes when looking at --binary-upgrade dumps,
and because we aim at copying catalog contents from one cluster to
another.

Anyway, I am not convinced that we have any need for a parse-able
grammar at all, because anything that's presented on this thread is
aimed at being used only for the internal purpose of an upgrade in a
--binary-upgrade dump with a direct catalog copy in mind, and having a
grammar would encourage abuses of it outside of this context. I think
that we should aim for simpler than what's proposed by the patch,
actually, with either a single SQL function à-la-binary_upgrade() that
adds the contents of a relation. Or we can be crazier and just create
INSERT queries for pg_subscription_rel to provide an exact copy of the
catalog contents. A SQL function would be more consistent with other
objects types that use similar tricks, see
binary_upgrade_create_empty_extension() that does something similar
for some pg_extension records. So, this function would require in
input 4 arguments:
- The subscription name or OID.
- The relation OID.
- Its LSN.
- Its sync state.

Added a SQL function to handle the insertion and removed the "ALTER
SUBSCRIPTION ... ADD TABLE" command that was added.
Attached patch has the changes for the same.

Regards,
Vignesh

Attachments:

v6-0001-Optionally-preserve-the-full-subscription-s-state.patchtext/x-patch; charset=US-ASCII; name=v6-0001-Optionally-preserve-the-full-subscription-s-state.patchDownload
From ce0e041bf120f3615ec7a02187ce27e9922688d2 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Wed, 6 Sep 2023 10:07:42 +0530
Subject: [PATCH v6] Optionally preserve the full subscription's state during
 pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

Similarly, the subscriptions' replication origins are needed to ensure
that we don't replicate anything twice.

To fix this problem, this patch teaches pg_dump in binary upgrade mode to
restore the content of pg_subscription_rel from the old cluster by using
binary_upgrade_create_sub_rel_state SQL function, and also provides an
additional LSN parameter for CREATE SUBSCRIPTION to restore the underlying
replication origin remote LSN.  The new binary_upgrade_create_sub_rel_state
SQL function and the new LSN parameter are not exposed to users and only
accepted in binary upgrade mode.

The new SQL binary_upgrade_create_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_create_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, state is the state of the relation, and sublsn is optional,
defaulting to NULL/InvalidXLogRecPtr if not provided.  pg_dump will retrieve
these values (subname, relid, state and sublsn) from the old cluster.

This mode is optional and not enabled by default.  A new
--preserve-subscription-state option is added to pg_upgrade to use it.  For
now, pg_upgrade will check that all the subscriptions have a valid replication
origin remote_lsn, and that all underlying relations are in 'r' (ready) state,
and will error out if that's not the case, logging the reason for the failure.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml          |  23 +++
 src/backend/catalog/pg_subscription.c    |  64 +++++++
 src/backend/commands/subscriptioncmds.c  |  10 +-
 src/bin/pg_dump/common.c                 |  22 +++
 src/bin/pg_dump/pg_backup.h              |   2 +
 src/bin/pg_dump/pg_dump.c                | 128 ++++++++++++-
 src/bin/pg_dump/pg_dump.h                |  15 ++
 src/bin/pg_upgrade/check.c               |  79 ++++++++
 src/bin/pg_upgrade/dump.c                |   3 +-
 src/bin/pg_upgrade/meson.build           |   1 +
 src/bin/pg_upgrade/option.c              |   6 +
 src/bin/pg_upgrade/pg_upgrade.h          |   1 +
 src/bin/pg_upgrade/t/003_subscription.pl | 220 +++++++++++++++++++++++
 src/include/catalog/pg_proc.dat          |   7 +
 src/tools/pgindent/typedefs.list         |   1 +
 15 files changed, 578 insertions(+), 4 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/003_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 7816b4c685..6af790c986 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -240,6 +240,29 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--preserve-subscription-state</option></term>
+      <listitem>
+       <para>
+        Fully preserve the logical subscription state if any.  That includes
+        the underlying replication origins with their remote LSN and the list of
+        relations in each subscription so that replication can be simply
+        resumed if the subscriptions are reactivated.
+       </para>
+       <para>
+        If this option isn't used, it is up to the user to reactivate the
+        subscriptions in a suitable way; see the subscription part in <xref
+        linkend="pg-dump-notes"/> for more information.
+       </para>
+       <para>
+        If this option is used and any of the subscriptions on the old cluster
+        has an unknown <varname>remote_lsn</varname> (0/0), or has any relation
+        in a state different from <literal>r</literal> (ready), the
+        <application>pg_upgrade</application> run will error.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-?</option></term>
       <term><option>--help</option></term>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d07f88ce28..fedf838d04 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -269,6 +269,70 @@ AddSubscriptionRelState(Oid subid, Oid relid, char state,
 	table_close(rel, NoLock);
 }
 
+/*
+ * binary_upgrade_create_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * table.
+ */
+Datum
+binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char		*subname;
+	Oid			relid;
+	char		state;
+	XLogRecPtr  sublsn;
+
+	if (!IsBinaryUpgrade)
+		ereport(ERROR,
+				errcode(ERRCODE_SYNTAX_ERROR),
+				errmsg("binary_upgrade_create_sub_rel_state can only be called when server is in binary upgrade mode"));
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) ||
+		PG_ARGISNULL(1) ||
+		PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_create_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	state = PG_GETARG_CHAR(2);
+
+	if (PG_ARGISNULL(3))
+		sublsn = InvalidXLogRecPtr;
+	else
+		sublsn = PG_GETARG_LSN(3);
+
+	if (!OidIsValid(relid))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					errmsg("invalid relation identifier used: %u", relid));
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				 errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, state, sublsn);
+
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
 /*
  * Update the state of a subscription table.
  */
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 34d881fd94..aa89581010 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -580,6 +580,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	bits32		supported_opts;
 	SubOpts		opts = {0};
 	AclResult	aclresult;
+	RepOriginId originid;
 
 	/*
 	 * Parse and check options.
@@ -592,6 +593,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 					  SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
 					  SUBOPT_DISABLE_ON_ERR | SUBOPT_PASSWORD_REQUIRED |
 					  SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN);
+	if (IsBinaryUpgrade)
+		supported_opts |= SUBOPT_LSN;
 	parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
 
 	/*
@@ -720,7 +723,12 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	recordDependencyOnOwner(SubscriptionRelationId, subid, owner);
 
 	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
-	replorigin_create(originname);
+	originid = replorigin_create(originname);
+
+	if (IsBinaryUpgrade && IsSet(opts.specified_opts, SUBOPT_LSN))
+		replorigin_advance(originid, opts.lsn, InvalidXLogRecPtr,
+						   false /* backward */ ,
+						   false /* WAL log */ );
 
 	/*
 	 * Connect to remote side to execute requested commands and fetch table
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index aba780ef4b..8c82657e76 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -200,6 +200,8 @@ typedef struct _dumpOptions
 
 	int			sequence_data;	/* dump sequence data even in schema-only mode */
 	int			do_nothing;
+
+	int		preserve_subscriptions;
 } DumpOptions;
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index cebd2400fd..181a070a3e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -431,6 +431,7 @@ main(int argc, char **argv)
 		{"table-and-children", required_argument, NULL, 12},
 		{"exclude-table-and-children", required_argument, NULL, 13},
 		{"exclude-table-data-and-children", required_argument, NULL, 14},
+		{"preserve-subscription-state", no_argument, &dopt.preserve_subscriptions, 1},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -714,6 +715,10 @@ main(int argc, char **argv)
 	if (dopt.do_nothing && dopt.dump_inserts == 0)
 		pg_fatal("option --on-conflict-do-nothing requires option --inserts, --rows-per-insert, or --column-inserts");
 
+	/* --preserve-subscription-state requires --binary-upgrade */
+	if (dopt.preserve_subscriptions && !dopt.binary_upgrade)
+		pg_fatal("option --preserve-subscription-state requires option --binary-upgrade");
+
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
 
@@ -4568,6 +4573,86 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  get information about the given subscription's relations
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	SubscriptionInfo *subinfo;
+	SubRelInfo *rels = NULL;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i_nrels;
+	int			i,
+				cur_rel = 0,
+				ntups,
+				last_srsubid = InvalidOid;
+
+	if (!fout->dopt->binary_upgrade || !fout->dopt->preserve_subscriptions ||
+		fout->remoteVersion < 100000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn,"
+					  " count(*) OVER (PARTITION BY srsubid) AS nrels"
+					  " FROM pg_subscription_rel"
+					  " ORDER BY srsubid");
+
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get subscription relation fields */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+	i_nrels = PQfnumber(res, "nrels");
+
+	for (i = 0; i < ntups; i++)
+	{
+		int			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+
+		/*
+		 * If we switched to a new subscription, setup the necessary fields in
+		 * the SubscriptionInfo and reset the cur_rel counter.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			int			nrels;
+
+			subinfo = findSubscriptionByOid(cur_srsubid);
+
+			nrels = atooid(PQgetvalue(res, i, i_nrels));
+			rels = pg_malloc(nrels * sizeof(SubRelInfo));
+
+			subinfo->subrels = rels;
+			subinfo->nrels = nrels;
+
+			last_srsubid = cur_srsubid;
+			cur_rel = 0;
+		}
+
+		rels[cur_rel].srrelid = atooid(PQgetvalue(res, i, i_srrelid));
+		rels[cur_rel].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		rels[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		cur_rel++;
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4593,6 +4678,7 @@ getSubscriptions(Archive *fout)
 	int			i_subpublications;
 	int			i_subbinary;
 	int			i_subpasswordrequired;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4647,15 +4733,19 @@ getSubscriptions(Archive *fout)
 	if (fout->remoteVersion >= 160000)
 		appendPQExpBufferStr(query,
 							 " s.suborigin,\n"
-							 " s.subpasswordrequired\n");
+							 " s.subpasswordrequired,\n");
 	else
 		appendPQExpBuffer(query,
 						  " '%s' AS suborigin,\n"
-						  " 't' AS subpasswordrequired\n",
+						  " 't' AS subpasswordrequired,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	appendPQExpBufferStr(query, "o.remote_lsn\n");
+
 	appendPQExpBufferStr(query,
 						 "FROM pg_subscription s\n"
+						 "LEFT JOIN pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4681,6 +4771,7 @@ getSubscriptions(Archive *fout)
 	i_subdisableonerr = PQfnumber(res, "subdisableonerr");
 	i_suborigin = PQfnumber(res, "suborigin");
 	i_subpasswordrequired = PQfnumber(res, "subpasswordrequired");
+	i_suboriginremotelsn = PQfnumber(res, "remote_lsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4713,6 +4804,18 @@ getSubscriptions(Archive *fout)
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
 		subinfo[i].subpasswordrequired =
 			pg_strdup(PQgetvalue(res, i, i_subpasswordrequired));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+
+		/*
+		 * For now assume there's no relation associated with the
+		 * subscription. Later code might update this field and allocate
+		 * subrels as needed.
+		 */
+		subinfo[i].nrels = 0;
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4797,9 +4900,29 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 	if (strcmp(subinfo->subpasswordrequired, "t") != 0)
 		appendPQExpBuffer(query, ", password_required = false");
 
+	if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
+		subinfo->suboriginremotelsn)
+		appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
+
 	appendPQExpBufferStr(query, ");\n");
 
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		for (i = 0; i < subinfo->nrels; i++)
+		{
+			appendPQExpBuffer(query,
+							  "SELECT binary_upgrade_create_sub_rel_state('%s', %u, '%c'",
+							  subinfo->dobj.name,
+							  subinfo->subrels[i].srrelid,
+							  subinfo->subrels[i].srsubstate);
+
+			if (subinfo->subrels[i].srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", '%s'",
+								  subinfo->subrels[i].srsublsn);
+
+			appendPQExpBufferStr(query, ");\n");
+		}
+
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
 								  .owner = subinfo->rolname,
@@ -4807,6 +4930,7 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 								  .section = SECTION_POST_DATA,
 								  .createStmt = query->data,
 								  .dropStmt = delq->data));
+	}
 
 	if (subinfo->dobj.dump & DUMP_COMPONENT_COMMENT)
 		dumpComment(fout, "SUBSCRIPTION", qsubname,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 9036b13f6a..6718397dfa 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -653,6 +653,16 @@ typedef struct _PublicationSchemaInfo
 	NamespaceInfo *pubschema;
 } PublicationSchemaInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	Oid			srrelid;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  * The SubscriptionInfo struct is used to represent subscription.
  */
@@ -670,6 +680,9 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *subpasswordrequired;
+	char	   *suboriginremotelsn;
+	int			nrels;
+	SubRelInfo *subrels;
 } SubscriptionInfo;
 
 /*
@@ -696,6 +709,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -755,5 +769,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 56e313f562..6d2d272fac 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -20,6 +20,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_state(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -104,6 +105,11 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
+	/* PG 10 introduced subscriptions. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1000 &&
+		user_opts.preserve_subscriptions)
+		check_for_subscription_state(&old_cluster);
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -785,6 +791,79 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_state()
+ *
+ * Verify that all subscriptions have a valid remote_lsn and don't contain
+ * any table in srsubstate different than ready ('r').
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)
+{
+	int			dbnum;
+	bool		is_error = false;
+
+	Assert(user_opts.preserve_subscriptions);
+
+	prep_status("Checking for subscription state");
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin_status only once. */
+		if (dbnum == 0)
+		{
+			int			ntup;
+
+			res = executeQueryOrDie(conn,
+									"SELECT s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT JOIN pg_catalog.pg_replication_origin_status os"
+									"  ON os.external_id = 'pg_' || s.oid "
+									"WHERE coalesce(remote_lsn, '0/0') = '0/0'");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				is_error = true;
+				pg_log(PG_WARNING,
+					   "\nWARNING:  subscription \"%s\" has an invalid remote_lsn",
+					   PQgetvalue(res, 0, 0));
+			}
+			PQclear(res);
+		}
+
+		res = executeQueryOrDie(conn,
+								"SELECT count(0) "
+								"FROM pg_catalog.pg_subscription_rel "
+								"WHERE srsubstate != 'r'");
+
+		if (PQntuples(res) != 1)
+			pg_fatal("could not determine the number of non-ready subscription relations");
+
+		if (strcmp(PQgetvalue(res, 0, 0), "0") != 0)
+		{
+			is_error = true;
+			pg_log(PG_WARNING,
+				   "\nWARNING: database \"%s\" has %s subscription "
+				   "relation(s) in non-ready state", active_db->db_name,
+				   PQgetvalue(res, 0, 0));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (is_error)
+		pg_fatal("--preserve-subscription-state is incompatible with "
+				 "subscription relations in non-ready state");
+
+	check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
diff --git a/src/bin/pg_upgrade/dump.c b/src/bin/pg_upgrade/dump.c
index 6c8c82dca8..9284576af7 100644
--- a/src/bin/pg_upgrade/dump.c
+++ b/src/bin/pg_upgrade/dump.c
@@ -53,9 +53,10 @@ generate_old_dump(void)
 
 		parallel_exec_prog(log_file_name, NULL,
 						   "\"%s/pg_dump\" %s --schema-only --quote-all-identifiers "
-						   "--binary-upgrade --format=custom %s --file=\"%s/%s\" %s",
+						   "--binary-upgrade --format=custom %s %s --file=\"%s/%s\" %s",
 						   new_cluster.bindir, cluster_conn_opts(&old_cluster),
 						   log_opts.verbose ? "--verbose" : "",
+						   user_opts.preserve_subscriptions ? "--preserve-subscription-state" : "",
 						   log_opts.dumpdir,
 						   sql_file_name, escaped_connstr.data);
 
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 12a97f84e2..9ea25dec70 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -42,6 +42,7 @@ tests += {
     'tests': [
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
+      't/003_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index 640361009e..a42c6defc2 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -57,6 +57,7 @@ parseCommandLine(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"clone", no_argument, NULL, 1},
 		{"copy", no_argument, NULL, 2},
+		{"preserve-subscription-state", no_argument, NULL, 3},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -199,6 +200,10 @@ parseCommandLine(int argc, char *argv[])
 				user_opts.transfer_mode = TRANSFER_MODE_COPY;
 				break;
 
+			case 3:
+				user_opts.preserve_subscriptions = true;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
 						os_info.progname);
@@ -289,6 +294,7 @@ usage(void)
 	printf(_("  -V, --version                 display version information, then exit\n"));
 	printf(_("  --clone                       clone instead of copying files to new cluster\n"));
 	printf(_("  --copy                        copy files to new cluster (default)\n"));
+	printf(_("  --preserve-subscription-state preserve the subscription state fully\n"));
 	printf(_("  -?, --help                    show this help, then exit\n"));
 	printf(_("\n"
 			 "Before running pg_upgrade you must:\n"
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 7afa96716e..f2cae91f69 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -304,6 +304,7 @@ typedef struct
 	transferMode transfer_mode; /* copy files or link them? */
 	int			jobs;			/* number of processes/threads to use */
 	char	   *socketdir;		/* directory to use for Unix sockets */
+	bool		preserve_subscriptions; /* fully transfer subscription state */
 } UserOpts;
 
 typedef struct
diff --git a/src/bin/pg_upgrade/t/003_subscription.pl b/src/bin/pg_upgrade/t/003_subscription.pl
new file mode 100644
index 0000000000..053077150c
--- /dev/null
+++ b/src/bin/pg_upgrade/t/003_subscription.pl
@@ -0,0 +1,220 @@
+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use Cwd qw(abs_path);
+use File::Basename qw(dirname);
+use File::Compare;
+use File::Find qw(find);
+use File::Path qw(rmtree);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::AdjustUpgrade;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $bindir = $new_sub->config_data('--bindir');
+
+sub insert_line
+{
+	my $payload = shift;
+
+	foreach("t1", "t2")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("t1", "t2")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line('before initial sync');
+
+# Setup logical replication, replicating only 1 table
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub FOR TABLE t1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub CONNECTION '$connstr' PUBLICATION pub");
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'sub');
+
+# Check that pg_upgrade refuses to run if there's a subscription without a valid
+# remote_lsn.
+#
+# Replication origin's remote_lsn isn't set if no data is replicated after the
+# initial sync.
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+		'--preserve-subscription-state',
+		'--check',
+	],
+	'run of pg_upgrade --check for old instance with invalid remote_lsn');
+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ not removed after pg_upgrade failure");
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Make sure the replication origin is set
+insert_line('after initial sync');
+$old_sub->wait_for_subscription_sync($publisher, 'sub');
+
+my $result = $old_sub->safe_psql('postgres',
+    "SELECT COUNT(*) FROM pg_subscription_rel WHERE srsubstate != 'r'");
+is ($result, qq(0), "All tables in pg_subscription_rel should be in ready state");
+
+# Check the number of rows for each table on each server
+$result = $publisher->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(2), "Table t1 should have 2 rows on the publisher");
+$result = $publisher->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(2), "Table t2 should have 2 rows on the publisher");
+$result = $old_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(2), "Table t1 should have 2 rows on the old subscriber");
+$result = $old_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(0), "Table t2 should have 0 rows on the old subscriber");
+
+# Check that pg_upgrade refuses to run if there's a subscription with tables in
+# a state different than 'r' (ready).
+$old_sub->safe_psql('postgres',
+    "ALTER SUBSCRIPTION sub DISABLE");
+$old_sub->safe_psql('postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'i' WHERE srsubstate = 'r'");
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+		'--preserve-subscription-state',
+		'--check',
+	],
+	'run of pg_upgrade --check for old instance with incorrect sub rel');
+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ not removed after pg_upgrade failure");
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Check that pg_upgrade doesn't detect any problem once all the subscription's
+# relation are in 'r' (ready) state.
+$old_sub->safe_psql('postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'r' WHERE srsubstate = 'i'");
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+		'--preserve-subscription-state',
+		'--check',
+	],
+	'run of pg_upgrade --check for old instance with correct sub rel');
+
+# Stop the old subscriber, insert a row in each table while it's down and add
+# t2 to the publication
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+insert_line('while old_sub is down');
+
+# Run pg_upgrade
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+		'--preserve-subscription-state',
+	],
+	'run of pg_upgrade for new sub');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after pg_upgrade success");
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub ADD TABLE t2");
+
+$new_sub->start;
+
+# Subscription relations and replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+    "SELECT count(*) FROM pg_subscription_rel");
+is ($result, qq(1), "There should be 1 row in pg_subscription_rel");
+
+$result = $new_sub->safe_psql('postgres',
+    "SELECT remote_lsn FROM pg_replication_origin_status");
+is ($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# There should be no new replicated rows before enabling the subscription
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(2), "Table t1 should still have 2 rows on the new subscriber");
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(0), "Table t2 should still have 0 rows on the new subscriber");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION sub ENABLE");
+
+$publisher->wait_for_catchup('sub');
+
+# Rows on t1 should have been replicated, while nothing should happen for t2
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(3), "Table t1 should now have 3 rows on the new subscriber");
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(0), "Table t2 should still have 0 rows on the new subscriber");
+
+# Refresh the subscription, only the missing row on t2 should be replicated
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'sub');
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t1");
+is ($result, qq(3), "Table t1 should still have 3 rows on the new subscriber");
+$result = $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM t2");
+is ($result, qq(3), "Table t2 should now have 3 rows on the new subscriber");
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9805bc6118..ac7ec5df31 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5488,6 +5488,13 @@
   proargmodes => '{i,o,o,o,o,o,o,o,o,o}',
   proargnames => '{subid,subid,relid,pid,leader_pid,received_lsn,last_msg_send_time,last_msg_receipt_time,latest_end_lsn,latest_end_time}',
   prosrc => 'pg_stat_get_subscription' },
+{ oid => '6108', descr => 'add a relation with the specified state to pg_subscription_rel table',
+  proname => 'binary_upgrade_create_sub_rel_state', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  proallargtypes => '{text,oid,char,pg_lsn}',
+  proargmodes => '{i,i,i,i}',
+  proargnames => '{subname,relid,state,sublsn}',
+  prosrc => 'binary_upgrade_create_sub_rel_state' },
 { oid => '2026', descr => 'statistics: current backend PID',
   proname => 'pg_backend_pid', provolatile => 's', proparallel => 'r',
   prorettype => 'int4', proargtypes => '', prosrc => 'pg_backend_pid' },
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 0656c94416..190ee73809 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2649,6 +2649,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#57Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#55)
Re: pg_upgrade and logical replication

On Mon, Sep 04, 2023 at 02:12:58PM +0530, Amit Kapila wrote:

Yeah, I agree that could be hacked quickly but note I haven't reviewed
in detail if there are other design issues in this patch. Note that we
thought first to support the upgrade of the publisher node, otherwise,
immediately after upgrading the subscriber and publisher, the
subscriptions won't work and start giving errors as they are dependent
on slots in the publisher. One other point that needs some thought is
that the LSN positions we are going to copy in the catalog may no
longer be valid after the upgrade (of the publisher) because we reset
WAL. Does that need some special consideration or are we okay with
that in all cases?

In pg_upgrade, copy_xact_xlog_xid() puts the new node ahead of the old
cluster by 8 segments on TLI 1, so how would it be a problem if the
subscribers keep a remote confirmed LSN lower than that in their
catalogs? (You've mentioned that to me offline, but I forgot the
details in the code.)

As of now, things are quite safe as documented in
pg_dump doc page that it will be the user's responsibility to set up
replication after dump/restore. I think it would be really helpful if
you could share your thoughts on the publisher-side matter as we are
facing a few tricky questions to be answered. For example, see a new
thread [1].

In my experience, users are quite used to upgrading standbys *first*,
even in simple scenarios like minor upgrades, because that's the only
way to do things safely. For example, updating and/or upgrading
primaries before the standbys could be a problem if an update
introduces a slight change in the WAL record format that the primary
could generate but a standby could not process, and we've made such
tweaks to some records in the past for bug fixes that had to be
backpatched to stable branches.

IMO, the upgrade of subscriber nodes and the upgrade of publisher
nodes need to be treated as two independent processing problems, dealt
with separately.

As you mentioned to me earlier offline, these two have, from what I
understand, one dependency: during a publisher upgrade we need to make
sure that there are no invalid slots when beginning to run pg_upgrade,
and that the confirmed LSN of all the slots used by the subscribers
matches the shutdown checkpoint's LSN, ensuring that the
subscribers would not lose any data because everything has already been
consumed by them by the time the publisher gets upgraded.

The point raised by Jonathan for not having an option for pg_upgrade
is that it will be easier for users; otherwise, users would always need
to enable this option. Consider a replication setup: wouldn't users want
it to be upgraded by default? Asking them to do that via an option
would be an inconvenience. So, that was the reason we wanted to have
an --exclude option and by default allow slots to be upgraded. I think
the same theory applies here.

[1] - /messages/by-id/CAA4eK1LV3+76CSOAk0h8Kv0AKb-OETsJHe6Sq6172-7DZXf0Qg@mail.gmail.com

I saw this thread, and have some thoughts to share. Will reply there.
--
Michael

#58vignesh C
vignesh21@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#45)
1 attachment(s)
Re: pg_upgrade and logical replication

On Thu, 27 Apr 2023 at 13:18, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Dear Julien,

Thank you for updating the patch! Followings are my comments.

01. documentation

This page describes the steps to upgrade a server with pg_upgrade. Should we
also write about the subscriber? IIUC, it is sufficient to just add to "Run
pg_upgrade" something like "Apart from a streaming replication standby, a
subscriber node can be upgraded via pg_upgrade. In that case we strongly
recommend using --preserve-subscription-state".

Now this option has been removed and made default

02. AlterSubscription

I agreed that the oid must be preserved between nodes, but I'm still afraid that
the given oid is unconditionally trusted and added to pg_subscription_rel.
I think we can check the existence of the relation via SearchSysCache1(RELOID,
ObjectIdGetDatum(relid)). Of course the check is optional, so it should be
executed only when USE_ASSERT_CHECKING is on. Thoughts?

Modified

03. main

Currently --preserve-subscription-state and --no-subscriptions can be used
together, but the situation is quite unnatural. Shouldn't we exclude them?

This option is removed now, so this scenario will not happen

04. getSubscriptionTables

```
+ SubRelInfo *rels = NULL;
```

The variable is used only inside the loop, so the definition should be also moved.

This logic is changed slightly, so it needs to be kept outside

05. getSubscriptionTables

```
+ nrels = atooid(PQgetvalue(res, i, i_nrels));
```

atoi() should be used instead of atooid().

Modified

06. getSubscriptionTables

```
+                       subinfo = findSubscriptionByOid(cur_srsubid);
+
+                       nrels = atooid(PQgetvalue(res, i, i_nrels));
+                       rels = pg_malloc(nrels * sizeof(SubRelInfo));
+
+                       subinfo->subrels = rels;
+                       subinfo->nrels = nrels;
```

Maybe it never occurs, but findSubscriptionByOid() can return NULL. In that case
accessing its attributes will lead to a segfault. Some handling is needed.

This should not happen, added a fatal error in this case.

07. dumpSubscription

Hmm, SubRelInfos are still dumped in dumpSubscription(). I think this style
does not follow pg_dump's conventions; I think another dump function is needed.
Please see dumpPublicationTable() and dumpPublicationNamespace(). If you have a
reason to use this style, a comment describing it is needed.

Modified

08. _SubRelInfo

If you address the above comment, a DumpableObject must be added as a new attribute.

Modified

09. check_for_subscription_state

```
+                       for (int i = 0; i < ntup; i++)
+                       {
+                               is_error = true;
+                               pg_log(PG_WARNING,
+                                          "\nWARNING:  subscription \"%s\" has an invalid remote_lsn",
+                                          PQgetvalue(res, 0, 0));
+                       }
```

The second argument of PQgetvalue() should be i, so that the subscription name
is reported correctly when there is more than one subscription.

Modified

10. 003_subscription.pl

```
$old_sub->wait_for_subscription_sync($publisher, 'sub');

my $result = $old_sub->safe_psql('postgres',
"SELECT COUNT(*) FROM pg_subscription_rel WHERE srsubstate != 'r'");
is ($result, qq(0), "All tables in pg_subscription_rel should be in ready state");
```

I think there is a possibility of a timing issue, because the SELECT may
be executed before srsubstate is changed from 's' to 'r'. Maybe
poll_query_until() can be used instead.

Modified

11. 003_subscription.pl

```
command_ok(
[
'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
'-D', $new_sub->data_dir, '-b', $bindir,
'-B', $bindir, '-s', $new_sub->host,
'-p', $old_sub->port, '-P', $new_sub->port,
$mode,
'--preserve-subscription-state',
'--check',
],
'run of pg_upgrade --check for old instance with correct sub rel');
```

Missing check of pg_upgrade_output.d?

Modified

And maybe you forgot to run pgperltidy.

It has been run for the new patch.

The attached v7 patch has the changes for the same.

Regards,
Vignesh

Attachments:

v7-0001-Preserve-the-full-subscription-s-state-during-pg_.patch (text/x-patch, US-ASCII)
From 4660c0914b8c3aef92461b84d7170ffc11bf5dd9 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Thu, 7 Sep 2023 11:37:36 +0530
Subject: [PATCH v7] Preserve the full subscription's state during pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

Similarly, the subscriptions' replication origins are needed to ensure
that we don't replicate anything twice.

To fix this problem, this patch teaches pg_dump in binary upgrade mode to
restore the content of pg_subscription_rel from the old cluster by using
binary_upgrade_create_sub_rel_state SQL function, and also provides an
additional LSN parameter for CREATE SUBSCRIPTION to restore the underlying
replication origin remote LSN.  The new binary_upgrade_create_sub_rel_state
SQL function and the new LSN parameter are not exposed to users and only
accepted in binary upgrade mode.

The new SQL binary_upgrade_create_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_create_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, state is the state of the relation, and sublsn is optional,
defaulting to NULL/InvalidXLogRecPtr if not provided.  pg_dump will retrieve
these values (subname, relid, state and sublsn) from the old cluster.
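For illustration, with a hypothetical subscription name, relation OID and LSN
(only accepted while the server runs in binary upgrade mode), the restore
script emitted by pg_dump would contain a call like:

```sql
-- hypothetical values; callable only in binary upgrade mode
SELECT binary_upgrade_create_sub_rel_state('sub1', 16384, 'r', '0/12345678');
```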

For now, pg_upgrade will check that all the subscriptions have a valid
replication origin remote_lsn, and that all underlying relations are in
'r' (ready) state, and will error out if that's not the case, logging the
reason for the failure.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml          |   7 +
 src/backend/catalog/pg_subscription.c    | 127 +++++++++++++
 src/bin/pg_dump/common.c                 |  22 +++
 src/bin/pg_dump/pg_dump.c                | 177 ++++++++++++++++-
 src/bin/pg_dump/pg_dump.h                |  18 +-
 src/bin/pg_dump/pg_dump_sort.c           |  11 +-
 src/bin/pg_upgrade/check.c               |  80 ++++++++
 src/bin/pg_upgrade/meson.build           |   1 +
 src/bin/pg_upgrade/t/003_subscription.pl | 230 +++++++++++++++++++++++
 src/include/catalog/pg_proc.dat          |  14 ++
 src/tools/pgindent/typedefs.list         |   1 +
 11 files changed, 683 insertions(+), 5 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/003_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index bea0d1b93f..cf9b6a4044 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -856,6 +856,13 @@ psql --username=postgres --file=script.sql postgres
    (<type>regclass</type>, <type>regrole</type>, and <type>regtype</type> can be upgraded.)
   </para>
 
+  <para>
+   To upgrade the subscriptions, all the subscriptions on the old cluster
+   must have a valid <varname>remote_lsn</varname>, and all the subscription
+   relations must be in the <literal>r</literal> (ready) state, or else
+   <application>pg_upgrade</application> will report an error.
+  </para>
+
   <para>
    If you want to use link mode and you do not want your old cluster
    to be modified when the new cluster is started, consider using the clone mode.
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d07f88ce28..64e26dc291 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -25,6 +25,8 @@
 #include "catalog/pg_type.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
@@ -269,6 +271,131 @@ AddSubscriptionRelState(Oid subid, Oid relid, char state,
 	table_close(rel, NoLock);
 }
 
+/*
+ * binary_upgrade_create_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * table.
+ */
+Datum
+binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	if (!IsBinaryUpgrade)
+		ereport(ERROR,
+				errcode(ERRCODE_SYNTAX_ERROR),
+				errmsg("binary_upgrade_create_sub_rel_state can only be called when server is in binary upgrade mode"));
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) ||
+		PG_ARGISNULL(1) ||
+		PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_create_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+
+	if (PG_ARGISNULL(3))
+		sublsn = InvalidXLogRecPtr;
+	else
+		sublsn = PG_GETARG_LSN(3);
+
+	if (!OidIsValid(relid))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("invalid relation identifier used: %u", relid));
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_sub_replication_origin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_sub_replication_origin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	sublsn;
+	char		originname[NAMEDATALEN];
+	RepOriginId originid;
+
+	if (!IsBinaryUpgrade)
+		ereport(ERROR,
+				errcode(ERRCODE_SYNTAX_ERROR),
+				errmsg("binary_upgrade_sub_replication_origin_advance can only be called when server is in binary upgrade mode"));
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) ||
+		PG_ARGISNULL(1))
+		elog(ERROR, "null argument to binary_upgrade_sub_replication_origin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	sublsn = PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+	originid = replorigin_by_name(originname, false);
+	replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
 /*
  * Update the state of a subscription table.
  */
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f7b6176692..2abeb573e7 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4576,6 +4577,92 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  get information about subscription membership for dumpable tables.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	SubscriptionInfo *subinfo;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i;
+	int			cur_rel = 0;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (!fout->dopt->binary_upgrade || fout->remoteVersion < 100000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get subscription relation fields */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[cur_rel].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[cur_rel].dobj.catId.tableoid = relid;
+		subrinfo[cur_rel].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[cur_rel].dobj);
+		subrinfo[cur_rel].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[cur_rel].tblinfo = tblinfo;
+		subrinfo[cur_rel].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+		subrinfo[cur_rel].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[cur_rel].dobj), fout);
+
+		cur_rel++;
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4601,6 +4688,7 @@ getSubscriptions(Archive *fout)
 	int			i_subpublications;
 	int			i_subbinary;
 	int			i_subpasswordrequired;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4655,15 +4743,19 @@ getSubscriptions(Archive *fout)
 	if (fout->remoteVersion >= 160000)
 		appendPQExpBufferStr(query,
 							 " s.suborigin,\n"
-							 " s.subpasswordrequired\n");
+							 " s.subpasswordrequired,\n");
 	else
 		appendPQExpBuffer(query,
 						  " '%s' AS suborigin,\n"
-						  " 't' AS subpasswordrequired\n",
+						  " 't' AS subpasswordrequired,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	appendPQExpBufferStr(query, "o.remote_lsn\n");
+
 	appendPQExpBufferStr(query,
 						 "FROM pg_subscription s\n"
+						 "LEFT JOIN pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4689,6 +4781,7 @@ getSubscriptions(Archive *fout)
 	i_subdisableonerr = PQfnumber(res, "subdisableonerr");
 	i_suborigin = PQfnumber(res, "suborigin");
 	i_subpasswordrequired = PQfnumber(res, "subpasswordrequired");
+	i_suboriginremotelsn = PQfnumber(res, "remote_lsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4721,6 +4814,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
 		subinfo[i].subpasswordrequired =
 			pg_strdup(PQgetvalue(res, i, i_subpasswordrequired));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4730,6 +4828,71 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  dump the definition of the given subscription table mapping
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_create_sub_rel_state will add the subscription
+		 * relation to the pg_subscription_rel catalog; this is supported
+		 * only during binary upgrade.
+		 */
+		if (fout->dopt->binary_upgrade && fout->remoteVersion >= 100000)
+		{
+			appendPQExpBuffer(query,
+							  "SELECT binary_upgrade_create_sub_rel_state('%s', %u, '%c'",
+							  subrinfo->dobj.name,
+							  subrinfo->tblinfo->dobj.catId.oid,
+							  subrinfo->srsubstate);
+
+			if (subrinfo->srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", '%s'",
+								  subrinfo->srsublsn);
+
+			appendPQExpBufferStr(query, ");\n");
+		}
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4807,6 +4970,12 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && subinfo->suboriginremotelsn)
+		appendPQExpBuffer(query,
+						  "SELECT binary_upgrade_sub_replication_origin_advance('%s', '%s');\n",
+						  subinfo->dobj.name,
+						  subinfo->suboriginremotelsn);
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10425,6 +10594,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18491,6 +18663,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 9036b13f6a..dd7ae15505 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -82,7 +82,8 @@ typedef enum
 	DO_PUBLICATION,
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
-	DO_SUBSCRIPTION
+	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL
 } DumpableObjectType;
 
 /*
@@ -670,8 +671,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *subpasswordrequired;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -696,6 +710,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -755,5 +770,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index 523a19c155..5bf1e47ee6 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -93,6 +93,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -146,10 +147,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1542,6 +1544,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d)",
+					 obj->dumpId);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 56e313f562..5bf4ba8aa1 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -20,6 +20,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_state(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -104,6 +105,8 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
+	check_for_subscription_state(&old_cluster);
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -785,6 +788,83 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_state()
+ *
+ * Verify that all subscriptions have a valid remote_lsn and don't contain
+ * any table in srsubstate different than ready ('r').
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)
+{
+	int			dbnum;
+	bool		is_error = false;
+
+	/* PG 10 introduced subscriptions. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1000)
+		return;
+
+	prep_status("Checking for subscription state");
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin_status only once. */
+		if (dbnum == 0)
+		{
+			int			ntup;
+
+			res = executeQueryOrDie(conn,
+									"SELECT s.subname, d.datname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT JOIN pg_catalog.pg_replication_origin_status os"
+									"  ON os.external_id = 'pg_' || s.oid "
+									"LEFT JOIN pg_catalog.pg_database d"
+									"  ON d.oid = s.subdbid "
+									"WHERE coalesce(remote_lsn, '0/0') = '0/0'");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				is_error = true;
+				pg_log(PG_WARNING,
+					   "\nWARNING: database \"%s\" has subscription \"%s\" with an invalid remote_lsn",
+					   PQgetvalue(res, i, 1),
+					   PQgetvalue(res, i, 0));
+			}
+			PQclear(res);
+		}
+
+		res = executeQueryOrDie(conn,
+								"SELECT count(0) "
+								"FROM pg_catalog.pg_subscription_rel "
+								"WHERE srsubstate != 'r'");
+
+		if (PQntuples(res) != 1)
+			pg_fatal("could not determine the number of non-ready subscription relations");
+
+		if (strcmp(PQgetvalue(res, 0, 0), "0") != 0)
+		{
+			is_error = true;
+			pg_log(PG_WARNING,
+				   "\nWARNING: database \"%s\" has %s subscription relation(s) in a non-ready state",
+				   active_db->db_name,
+				   PQgetvalue(res, 0, 0));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (is_error)
+		pg_fatal("subscription(s) have an invalid remote_lsn or subscription relation(s) are not in ready state");
+
+	check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 12a97f84e2..9ea25dec70 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -42,6 +42,7 @@ tests += {
     'tests': [
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
+      't/003_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/t/003_subscription.pl b/src/bin/pg_upgrade/t/003_subscription.pl
new file mode 100644
index 0000000000..350c7971f0
--- /dev/null
+++ b/src/bin/pg_upgrade/t/003_subscription.pl
@@ -0,0 +1,230 @@
+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use Cwd qw(abs_path);
+use File::Basename qw(dirname);
+use File::Compare;
+use File::Find qw(find);
+use File::Path qw(rmtree);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::AdjustUpgrade;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $bindir = $new_sub->config_data('--bindir');
+
+sub insert_line
+{
+	my $payload = shift;
+
+	foreach ("t1", "t2")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("t1", "t2")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line('before initial sync');
+
+# Setup logical replication, replicating only 1 table
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR TABLE t1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub CONNECTION '$connstr' PUBLICATION pub");
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'sub');
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if there's a subscription without a
+# valid remote_lsn.
+# ------------------------------------------------------
+
+# Replication origin's remote_lsn isn't set if no data is replicated after the
+# initial sync.
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with invalid remote_lsn');
+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ not removed after pg_upgrade failure");
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Make sure the replication origin is set
+insert_line('after initial sync');
+$old_sub->wait_for_subscription_sync($publisher, 'sub');
+
+my $result = $old_sub->safe_psql('postgres',
+	"SELECT COUNT(*) FROM pg_subscription_rel WHERE srsubstate != 'r'");
+is($result, qq(0),
+	"All tables in pg_subscription_rel should be in ready state");
+
+# Ensure that relation has reached 'ready' state
+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r');";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the number of rows for each table on each server
+$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2), "check initial t1 table data on publisher");
+$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(2), "check initial t2 table data on publisher");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2), "check initial t1 table data on the old subscriber");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0), "check initial t2 table data on the old subscriber");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if there's a subscription with tables in
+# a state different than 'r' (ready).
+# ------------------------------------------------------
+
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub DISABLE");
+
+# Set tables to 'i' state
+$old_sub->safe_psql(
+	'postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'i' WHERE srsubstate = 'r'");
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with incorrect sub rel');
+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ not removed after pg_upgrade failure");
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# ------------------------------------------------------
+# Check that pg_upgrade doesn't detect any problem once all the subscription's
+# relations are in the 'r' (ready) state.
+# ------------------------------------------------------
+
+$old_sub->safe_psql(
+	'postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'r' WHERE srsubstate = 'i'");
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with correct sub rel');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after pg_upgrade success");
+
+# ------------------------------------------------------
+# Check that after upgrading the subscriber server, the incremental
+# changes added to the publisher are replicated.
+# ------------------------------------------------------
+
+# Stop the old subscriber, insert a row in each table while it's down and add
+# t2 to the publication
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+insert_line('while old_sub is down');
+
+# Run pg_upgrade
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+	],
+	'run of pg_upgrade for new sub');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after pg_upgrade success");
+$publisher->safe_psql('postgres', "ALTER PUBLICATION pub ADD TABLE t2");
+
+$new_sub->start;
+
+# Subscription relations and replication origin remote_lsn should be preserved
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(1), "There should be 1 row in pg_subscription_rel");
+
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# There should be no new replicated rows before enabling the subscription
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2),
+	"t1 table has no new replicated rows before enabling the subscription");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in t2 table which is not part of the publication");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
+
+$publisher->wait_for_catchup('sub');
+
+# Rows on t1 should have been replicated, while nothing should happen for t2
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(3), "check replicated inserts on new subscriber");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in table t2, which is not part of the publication, after enabling the subscription"
+);
+
+# Refresh the subscription, only the missing row on t2 should be replicated
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'sub');
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(3),
+	"check there is no change when no new changes were replicated");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(3),
+	"check replicated inserts on new subscriber after refreshing");
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9805bc6118..1c1eeaa667 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5488,6 +5488,20 @@
   proargmodes => '{i,o,o,o,o,o,o,o,o,o}',
   proargnames => '{subid,subid,relid,pid,leader_pid,received_lsn,last_msg_send_time,last_msg_receipt_time,latest_end_lsn,latest_end_time}',
   prosrc => 'pg_stat_get_subscription' },
+{ oid => '6108', descr => 'add a relation with the specified relation state to pg_subscription_rel table',
+  proname => 'binary_upgrade_create_sub_rel_state', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  proallargtypes => '{text,oid,char,pg_lsn}',
+  proargmodes => '{i,i,i,i}',
+  proargnames => '{subname,relid,relstate,sublsn}',
+  prosrc => 'binary_upgrade_create_sub_rel_state' },
+{ oid => '6109', descr => 'update the remote_lsn for the subscriber\'s replication origin',
+  proname => 'binary_upgrade_sub_replication_origin_advance', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  proallargtypes => '{text,pg_lsn}',
+  proargmodes => '{i,i}',
+  proargnames => '{subname,sublsn}',
+  prosrc => 'binary_upgrade_sub_replication_origin_advance' },
 { oid => '2026', descr => 'statistics: current backend PID',
   proname => 'pg_backend_pid', provolatile => 's', proparallel => 'r',
   prorettype => 'int4', proargtypes => '', prosrc => 'pg_backend_pid' },
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index f2af84d7ca..ff03f2a830 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2650,6 +2650,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#59vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#46)
Re: pg_upgrade and logical replication

On Wed, 10 May 2023 at 13:29, Peter Smith <smithpb2250@gmail.com> wrote:

Here are some review comments for the v5-0001 patch code.

======
General

1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

I was a bit confused by this relation 'state' mentioned in multiple
places. IIUC the pg_upgrade logic is going to reject anything with a
non-READY (not 'r') state anyhow, so what is the point of having all
the extra grammar/parse_subscription_options etc to handle setting the
state when only possible value must be 'r'?

This command has been removed, along with the related code.

2. state V relstate

I still feel code readability suffers a bit by calling some fields/vars
a generic 'state' instead of the more descriptive 'relstate'. Maybe
it's just me.

Previously commented same (see [1]#3, #4, #5)

Some of that code has been removed; I have modified the rest wherever possible.

======
doc/src/sgml/ref/pgupgrade.sgml

3.
+       <para>
+        Fully preserve the logical subscription state if any.  That includes
+        the underlying replication origin with their remote LSN and the list of
+        relations in each subscription so that replication can be simply
+        resumed if the subscriptions are reactivated.
+       </para>

I think the "if any" part is not necessary. If you remove those words,
then the rest of the sentence can be simplified.

SUGGESTION
Fully preserve the logical subscription state, which includes the
underlying replication origin's remote LSN, and the list of relations
in each subscription. This allows replication to simply resume when
the subscriptions are reactivated.

This has been removed now.

4.
+       <para>
+        If this option isn't used, it is up to the user to reactivate the
+        subscriptions in a suitable way; see the subscription part in <xref
+        linkend="pg-dump-notes"/> for more information.
+       </para>

The link still renders strangely as previously reported (see [1]#2b).

This has been removed now

5.
+       <para>
+        If this option is used and any of the subscription on the old cluster
+        has an unknown <varname>remote_lsn</varname> (0/0), or has any relation
+        in a state different from <literal>r</literal> (ready), the
+        <application>pg_upgrade</application> run will error.
+       </para>

5a.
/subscription/subscriptions/

Modified

5b
"has any relation in a state different from r" --> "has any relation
with state other than r"

Modified slightly

======
src/backend/commands/subscriptioncmds.c

6.
+ if (strlen(state_str) != 1)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid relation state: %s", state_str)));

Is this relation state validation overly simplistic, by only checking
for length 1? Shouldn't this just be asserting the relstate must be
'r'?

This code has been removed

======
src/bin/pg_dump/pg_dump.c

7. getSubscriptionTables

+/*
+ * getSubscriptionTables
+ *   get information about the given subscription's relations
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+ SubscriptionInfo *subinfo;
+ SubRelInfo *rels = NULL;
+ PQExpBuffer query;
+ PGresult   *res;
+ int i_srsubid;
+ int i_srrelid;
+ int i_srsubstate;
+ int i_srsublsn;
+ int i_nrels;
+ int i,
+ cur_rel = 0,
+ ntups,
+ last_srsubid = InvalidOid;

Why some above are single int declarations and some are compound int
declarations? Why not make them all consistent?

Modified

~

8.
+ appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn,"
+   " count(*) OVER (PARTITION BY srsubid) AS nrels"
+   " FROM pg_subscription_rel"
+   " ORDER BY srsubid");

Should this SQL be schema-qualified like pg_catalog.pg_subscription_rel?

Modified

~

9.
+ for (i = 0; i < ntups; i++)
+ {
+ int cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));

Should 'cur_srsubid' be declared Oid to match the atooid?

Modified

~~~

10. getSubscriptions

+ if (PQgetisnull(res, i, i_suboriginremotelsn))
+ subinfo[i].suboriginremotelsn = NULL;
+ else
+ subinfo[i].suboriginremotelsn =
+ pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+
+ /*
+ * For now assume there's no relation associated with the
+ * subscription. Later code might update this field and allocate
+ * subrels as needed.
+ */
+ subinfo[i].nrels = 0;

The wording "For now assume there's no" kind of gives an ambiguous
interpretation for this comment. IMO it sounds like this is the
"current" logic but some future PG version may behave differently - I
don't think that is the intended meaning at all.

SUGGESTION.
Here we just initialize nrels to say there are 0 relations associated
with the subscription. If necessary, subsequent logic will update this
field and allocate the subrels.

This part of the logic has been removed now as it is no longer required.

~~~

11. dumpSubscription

+ for (i = 0; i < subinfo->nrels; i++)
+ {
+ appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
+   "(relid = %u, state = '%c'",
+   qsubname,
+   subinfo->subrels[i].srrelid,
+   subinfo->subrels[i].srsubstate);
+
+ if (subinfo->subrels[i].srsublsn[0] != '\0')
+ appendPQExpBuffer(query, ", LSN = '%s'",
+   subinfo->subrels[i].srsublsn);
+
+ appendPQExpBufferStr(query, ");");
+ }

I previously asked ([1]#11) about how can this ALTER SUBSCRIPTION
TABLE code happen unless 'preserve_subscriptions' is true, and you
confirmed "It indirectly is, as in that case subinfo->nrels is
guaranteed to be 0. I just tried to keep the code simpler and avoid
too many nested conditions."

I have added the same check that is used to get the subscription
tables, to avoid confusion.

~

If you are worried about too many nested conditions then a simple
Assert(dopt->preserve_subscriptions); might be good to have here.

======
src/bin/pg_upgrade/check.c

12. check_and_dump_old_cluster

+ /* PG 10 introduced subscriptions. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1000 &&
+ user_opts.preserve_subscriptions)
+ {
+ check_for_subscription_state(&old_cluster);
+ }

12a.
All the other checks in this function seem to be in decreasing order
of PG version so maybe this check should be moved to follow that same
pattern.

Modified

~

12b.
Also won't it be better to give some error or notice of some kind if
the option/version are incompatible? I think this was mentioned in a
previous review.

e.g.

if (user_opts.preserve_subscriptions)
{
if (GET_MAJOR_VERSION(old_cluster.major_version) < 1000)
<pg_log or pg_fatal goes here...>;
check_for_subscription_state(&old_cluster);
}

This has been removed now

~~~

13. check_for_subscription_state

+ for (int i = 0; i < ntup; i++)
+ {
+ is_error = true;
+ pg_log(PG_WARNING,
+    "\nWARNING:  subscription \"%s\" has an invalid remote_lsn",
+    PQgetvalue(res, 0, 0));
+ }

13a.
This WARNING does not mention the database, but a similar warning
later about the non-ready state does mention the database. Probably
they should be consistent.

Modified

~

13b.
Something seems amiss. Here the is_error is assigned true; But later
when you test is_error that is for logging the ready-state problem.
Isn't there another missing pg_fatal for this invalid remote_lsn case?

Modified

======
src/bin/pg_upgrade/option.c

14. usage

+ printf(_(" --preserve-subscription-state preserve the subscription
state fully\n"));

Why say "fully"? How is "preserve the subscription state fully"
different to "preserve the subscription state" from the user's POV?

This has been removed now

These are handled as part of v7 posted at [1].
[1] - /messages/by-id/CALDaNm1ZrbHaWpJwwNhDTJocRKWd3rEkgJazuDdZ9Z-WdvonFg@mail.gmail.com

Regards,
Vignesh

#60vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#54)
Re: pg_upgrade and logical replication

On Mon, 4 Sept 2023 at 13:26, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Sep 4, 2023 at 11:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jul 19, 2023 at 12:47 PM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, May 10, 2023 at 05:59:24PM +1000, Peter Smith wrote:

1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

I was a bit confused by this relation 'state' mentioned in multiple
places. IIUC the pg_upgrade logic is going to reject anything with a
non-READY (not 'r') state anyhow, so what is the point of having all
the extra grammar/parse_subscription_options etc to handle setting the
state when only possible value must be 'r'?

We are just talking about the handling of an extra DefElem in an
extensible grammar pattern, so adding the state field does not
represent much maintenance work. I'm OK with the addition of this
field in the data set dumped, FWIW, on the ground that it can be
useful for debugging purposes when looking at --binary-upgrade dumps,
and because we aim at copying catalog contents from one cluster to
another.

Anyway, I am not convinced that we have any need for a parse-able
grammar at all, because anything that's presented on this thread is
aimed at being used only for the internal purpose of an upgrade in a
--binary-upgrade dump with a direct catalog copy in mind, and having a
grammar would encourage abuses of it outside of this context. I think
that we should aim for simpler than what's proposed by the patch,
actually, with either a single SQL function à-la-binary_upgrade() that
adds the contents of a relation. Or we can be crazier and just create
INSERT queries for pg_subscription_rel to provide an exact copy of the
catalog contents. A SQL function would be more consistent with other
objects types that use similar tricks, see
binary_upgrade_create_empty_extension() that does something similar
for some pg_extension records. So, this function would take 4 input
arguments:
- The subscription name or OID.
- The relation OID.
- Its LSN.
- Its sync state.

+1 for doing it via function (something like
binary_upgrade_create_sub_rel_state). We already have the internal
function AddSubscriptionRelState() that can do the core work.

Modified

One more related point:
@@ -4814,9 +4923,31 @@ dumpSubscription(Archive *fout, const
SubscriptionInfo *subinfo)
if (strcmp(subinfo->subpasswordrequired, "t") != 0)
appendPQExpBuffer(query, ", password_required = false");

+ if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
+ subinfo->suboriginremotelsn)
+ {
+ appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
+ }

Even during Create Subscription, we can use an existing function
(pg_replication_origin_advance()) or a set of functions to advance the
origin instead of introducing a new option.

Added a function binary_upgrade_sub_replication_origin_advance which
will: a) check if the subscription exists, b) get the replication origin
name for the subscription, and c) advance the replication origin.
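As a point of reference, the existing core function Amit mentions can already be driven directly from SQL; a rough manual equivalent of the new helper might look like this (the origin name and LSN are illustrative only; internally, subscription origins are named pg_<subscription oid>):

```sql
-- Illustrative values: 16389 would be the subscription's OID, and the
-- LSN the remote_lsn carried over from the old cluster.
SELECT pg_replication_origin_advance('pg_16389', '0/1573658');
```

The new helper additionally validates that the subscription exists and resolves the origin name itself, which is why a dedicated binary-upgrade function is safer than exposing this pattern in the dump.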

These are handled as part of v7 posted at [1].
[1] - /messages/by-id/CALDaNm1ZrbHaWpJwwNhDTJocRKWd3rEkgJazuDdZ9Z-WdvonFg@mail.gmail.com

Regards,
Vignesh

#61vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#47)
Re: pg_upgrade and logical replication

On Wed, 10 May 2023 at 13:39, Peter Smith <smithpb2250@gmail.com> wrote:

On Mon, Apr 24, 2023 at 4:19 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

Hi,

On Thu, Apr 13, 2023 at 03:26:56PM +1000, Peter Smith wrote:

1.
All the comments look alike, so it is hard to know what is going on.
If each of the main test parts could be highlighted then the test code
would be easier to read IMO.

Something like below:
[...]

I added a bit more comments about what's is being tested. I'm not sure that a
big TEST CASE prefix is necessary, as it's not really multiple separated test
cases and other stuff can be tested in between. Also AFAICT no other TAP test
currently needs this kind of banner, even if they're testing more complex
scenarios.

Hmm, I think there are plenty of examples of subscription TAP tests
having some kind of highlighted comments as suggested, for better
readability.

e.g. See src/test/subscription
t/014_binary.pl
t/015_stream.pl
t/016_stream_subxact.pl
t/018_stream_subxact_abort.pl
t/021_twophase.pl
t/022_twophase_cascade.pl
t/023_twophase_stream.pl
t/028_row_filter.pl
t/030_origin.pl
t/031_column_list.pl
t/032_subscribe_use_index.pl

A simple #################### to separate the main test parts is all
that is needed.

Modified

4b.
All these messages like "Table t1 should still have 2 rows on the new
subscriber" don't seem very helpful. e.g. They are not saying anything
about WHAT this is testing or WHY it should still have 2 rows.

I don't think that those messages are supposed to say what or why something is
tested, just give a quick context / reference on the test in case it's broken.
The comments are there to explain in more details what is tested and/or why.

But, why can’t they do both? They can be a quick reference *and* at
the same time give some more meaning to the error log. Otherwise,
these messages might as well just say ‘ref1’, ‘ref2’, ‘ref3’...

Modified

These are handled as part of v7 posted at [1].
[1] - /messages/by-id/CALDaNm1ZrbHaWpJwwNhDTJocRKWd3rEkgJazuDdZ9Z-WdvonFg@mail.gmail.com

Regards,
Vignesh

#62Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: vignesh C (#58)
1 attachment(s)
RE: pg_upgrade and logical replication

Dear Vignesh,

Thank you for updating the patch! Here are some comments.

Sorry if there are duplicate comments - the thread was revived recently, so I
might have lost track.

01. General

Is there a possibility that an apply worker on the old cluster connects to the
publisher during the upgrade? Regarding pg_upgrade on the publisher, we refuse
TCP/IP connections from remote hosts and the port number is also changed, so we
can assume that the subscriber cannot connect to it. But IIUC such settings may
not affect the connection source, so the apply worker may still try to connect
to the publisher. Also, are there any hazards if that happens?

02. Upgrade functions

Two functions - binary_upgrade_create_sub_rel_state and binary_upgrade_sub_replication_origin_advance
should be located at pg_upgrade_support.c. Also, CHECK_IS_BINARY_UPGRADE() macro
can be used.

03. Parameter combinations

IIUC getSubscriptionTables() should exit quickly if --no-subscriptions is
specified, whereas binary_upgrade_create_sub_rel_state() fails.

04. I failed my test

I executed attached script but failed to upgrade:

```
Restoring database schemas in the new cluster
postgres
*failure*

Consult the last few lines of "data_N3/pg_upgrade_output.d/20230912T054546.320/log/pg_upgrade_dump_5.log" for
the probable cause of the failure.
Failure, exiting
```

I checked the log and found that binary_upgrade_create_sub_rel_state() does not
support skipping the fourth argument:

```
pg_restore: from TOC entry 4059; 16384 16387 SUBSCRIPTION TABLE sub sub postgres
pg_restore: error: could not execute query: ERROR: function binary_upgrade_create_sub_rel_state(unknown, integer, unknown) does not exist
LINE 1: SELECT binary_upgrade_create_sub_rel_state('sub', 16384, 'r'...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
Command was: SELECT binary_upgrade_create_sub_rel_state('sub', 16384, 'r');
```

IIUC if we allow to skip arguments, we must define wrappers like pg_copy_logical_replication_slot_*.
Another approach is that pg_dump always dumps srsublsn even if it is NULL.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachments:

test.sh (application/octet-stream) [Download]
#63Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: vignesh C (#58)
RE: pg_upgrade and logical replication

On Monday, September 11, 2023 6:32 PM vignesh C <vignesh21@gmail.com> wrote:

The attached v7 patch has the changes for the same.

Thanks for updating the patch, here are few comments:

1.

+/*
+ * binary_upgrade_sub_replication_origin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_sub_replication_origin_advance(PG_FUNCTION_ARGS)
+{

Is there any usage apart from pg_upgrade for this function? If not, I think
we'd better move this function to pg_upgrade_support.c. If there is, it may be
better to rename it to something more general.

2.

+ * Verify that all subscriptions have a valid remote_lsn and don't contain
+ * any table in srsubstate different than ready ('r').
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)

I think we'd better follow the same style of
check_for_isn_and_int8_passing_mismatch() to record the invalid things in a
file.

3.

+		if (fout->dopt->binary_upgrade && fout->remoteVersion >= 100000)
+		{
+			appendPQExpBuffer(query,
+							  "SELECT binary_upgrade_create_sub_rel_state('%s', %u, '%c'",
+							  subrinfo->dobj.name,

I think we'd better consider using appendStringLiteral or related function for
the dobj.name here to make sure the string conversion is safe.

4.

The following commit message may need update:
"binary_upgrade_create_sub_rel_state SQL function, and also provides an
additional LSN parameter for CREATE SUBSCRIPTION to restore the underlying
replication origin remote LSN. "

I think we have changed to another approach which doesn't provide a new
parameter in the DDL.

5. 
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));

Since we don't modify the tuple here, SearchSysCache2 seems enough.

6. 
+									"LEFT JOIN pg_catalog.pg_database d"
+									"  ON d.oid = s.subdbid "
+									"WHERE coalesce(remote_lsn, '0/0') = '0/0'");

For the subscriptions that were just created and finished the table sync but
haven't applied any changes, their remote_lsn will also be 0/0. Do we
need to report an ERROR in this case?
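As a rough standalone sketch of such a check (the exact joins in the patch may differ; subscription origins appear in pg_replication_origin_status under the name pg_<subscription oid>):

```sql
-- List subscriptions whose origin has no valid remote_lsn yet.
-- A just-created subscription that only finished table sync can
-- legitimately show 0/0 here, which is the concern raised above.
SELECT s.subname, d.datname
FROM pg_subscription s
LEFT JOIN pg_replication_origin_status o
       ON o.external_id = 'pg_' || s.oid
LEFT JOIN pg_catalog.pg_database d
       ON d.oid = s.subdbid
WHERE coalesce(o.remote_lsn, '0/0') = '0/0';
```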

Best Regards,
Hou zj

#64Michael Paquier
michael@paquier.xyz
In reply to: vignesh C (#60)
Re: pg_upgrade and logical replication

On Mon, Sep 11, 2023 at 05:19:27PM +0530, vignesh C wrote:

Added a function binary_upgrade_sub_replication_origin_advance which
will: a) check if the subscription exists, b) get the replication origin
name for the subscription, and c) advance the replication origin.

These are handled as part of v7 posted at [1].
[1] - /messages/by-id/CALDaNm1ZrbHaWpJwwNhDTJocRKWd3rEkgJazuDdZ9Z-WdvonFg@mail.gmail.com

Thanks. I can see that some of the others have already provided
comments about this version. I have some comments on top of that.
--
Michael

#65Michael Paquier
michael@paquier.xyz
In reply to: Zhijie Hou (Fujitsu) (#63)
Re: pg_upgrade and logical replication

On Tue, Sep 12, 2023 at 01:22:50PM +0000, Zhijie Hou (Fujitsu) wrote:

+/*
+ * binary_upgrade_sub_replication_origin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_sub_replication_origin_advance(PG_FUNCTION_ARGS)
+{

Is there any usage apart from pg_upgrade for this function? If not, I think
we'd better move this function to pg_upgrade_support.c. If there is, it may be
better to rename it to something more general.

I was equally surprised by the choice of the patch regarding the
location of these functions, so I agree with your point that these
functions should be in pg_upgrade_support.c. All the sub-routines
these two functions rely on are defined in some headers already, so
there seem to be nothing new required for pg_upgrade_support.c.
--
Michael

#66vignesh C
vignesh21@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#62)
2 attachment(s)
Re: pg_upgrade and logical replication

On Tue, 12 Sept 2023 at 14:25, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Dear Vignesh,

Thank you for updating the patch! Here are some comments.

Sorry if there are duplicate comments - the thread was revived recently, so I
might have lost track.

01. General

Is there a possibility that an apply worker on the old cluster connects to the
publisher during the upgrade? Regarding pg_upgrade on the publisher, we refuse
TCP/IP connections from remote hosts and the port number is also changed, so we
can assume that the subscriber cannot connect to it. But IIUC such settings may
not affect the connection source, so the apply worker may still try to connect
to the publisher. Also, are there any hazards if that happens?

Yes, there is a possibility that the apply worker gets started and syncs new
transaction data from the publisher. I have made a fix not to start the
launcher process in binary upgrade mode, as we don't want the launcher to
start an apply worker during the upgrade.

02. Upgrade functions

Two functions - binary_upgrade_create_sub_rel_state and binary_upgrade_sub_replication_origin_advance
should be located at pg_upgrade_support.c. Also, CHECK_IS_BINARY_UPGRADE() macro
can be used.

Modified

03. Parameter combinations

IIUC getSubscriptionTables() should exit quickly if --no-subscriptions is
specified, whereas binary_upgrade_create_sub_rel_state() fails.

Modified

04. I failed my test

I executed attached script but failed to upgrade:

```
Restoring database schemas in the new cluster
postgres
*failure*

Consult the last few lines of "data_N3/pg_upgrade_output.d/20230912T054546.320/log/pg_upgrade_dump_5.log" for
the probable cause of the failure.
Failure, exiting
```

I checked the log and found that binary_upgrade_create_sub_rel_state() does not
support skipping the fourth argument:

```
pg_restore: from TOC entry 4059; 16384 16387 SUBSCRIPTION TABLE sub sub postgres
pg_restore: error: could not execute query: ERROR: function binary_upgrade_create_sub_rel_state(unknown, integer, unknown) does not exist
LINE 1: SELECT binary_upgrade_create_sub_rel_state('sub', 16384, 'r'...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
Command was: SELECT binary_upgrade_create_sub_rel_state('sub', 16384, 'r');
```

IIUC if we allow to skip arguments, we must define wrappers like pg_copy_logical_replication_slot_*.
Another approach is that pg_dump always dumps srsublsn even if it is NULL.

Modified
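For illustration, with srsublsn always dumped, the --binary-upgrade dump could emit one call per pg_subscription_rel row, passing the LSN explicitly (NULL when absent). The subscription name, OIDs, and LSN below are made up, and this shows only one possible shape of the fix:

```sql
SELECT binary_upgrade_create_sub_rel_state('sub', 16384, 'r', NULL);
SELECT binary_upgrade_create_sub_rel_state('sub', 16385, 'r', '0/1573658');
```

This avoids the "function does not exist" lookup failure above, since the four-argument signature is always matched.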

The attached v8 version patch has the changes for the same.

Regards,
Vignesh

Attachments:

v8-0001-Don-t-start-launcher-process-in-binary-upgrade-mo.patch (text/x-patch) [Download]
From 84dff766b98381dafd7382b7521107a9f76608dd Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Thu, 14 Sep 2023 09:59:46 +0530
Subject: [PATCH v8 1/3] Don't start launcher process in binary upgrade mode.

We don't want the launcher to run in binary upgrade mode because the
launcher might start an apply worker which will start receiving changes
from the publisher and update the old cluster before the upgrade is
completed.
---
 src/backend/replication/logical/launcher.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 7882fc91ce..ec8cfd40ef 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -925,6 +925,14 @@ ApplyLauncherRegister(void)
 {
 	BackgroundWorker bgw;
 
+	/*
+	 * We don't want the launcher to run in binary upgrade mode because the
+	 * launcher might start an apply worker which will start receiving
+	 * changes from the publisher before the physical files are put in place.
+	 */
+	if (IsBinaryUpgrade)
+		return;
+
 	if (max_logical_replication_workers == 0)
 		return;
 
-- 
2.34.1

v8-0002-Preserve-the-full-subscription-s-state-during-pg_.patch (text/x-patch) [Download]
From 1d0a182cbfbab749b267e2400b17e99f88bd5571 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Thu, 7 Sep 2023 11:37:36 +0530
Subject: [PATCH v8 2/3] Preserve the full subscription's state during
 pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_create_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_create_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_create_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, the state is the state of the relation, sublsn is subscription lsn
which is optional, and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values (subname, relid, state and sublsn) from the
old cluster.

The subscriptions' replication origins are needed to ensure
that we don't replicate anything twice.

To fix this problem, this patch teaches pg_dump to update the replication
origin along with create subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is subscription lsn.
pg_dump will retrieve these values (subname and sublsn) from the old cluster.

pg_upgrade will check that all the subscriptions have a valid replication origin
remote_lsn, and that all underlying relations are in 'r' (ready) state, and
will error out if that's not the case, logging the reason for the failure.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |   7 +
 src/backend/catalog/pg_subscription.c      |   2 +
 src/backend/utils/adt/pg_upgrade_support.c | 140 +++++++++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 183 +++++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  18 +-
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 |  99 +++++++++
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/t/003_subscription.pl   | 230 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  21 ++
 src/tools/pgindent/typedefs.list           |   1 +
 12 files changed, 730 insertions(+), 5 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/003_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index bea0d1b93f..cf9b6a4044 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -856,6 +856,13 @@ psql --username=postgres --file=script.sql postgres
    (<type>regclass</type>, <type>regrole</type>, and <type>regtype</type> can be upgraded.)
   </para>
 
+  <para>
+   For the upgrade of subscriptions, all the subscriptions on the old
+   cluster must have a valid <varname>remote_lsn</varname>, and all the
+   subscription tables should be in <literal>r</literal> (ready) state, or else
+   the <application>pg_upgrade</application> run will error.
+  </para>
+
   <para>
    If you want to use link mode and you do not want your old cluster
    to be modified when the new cluster is started, consider using the clone mode.
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d07f88ce28..6c98f85fb1 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -25,6 +25,8 @@
 #include "catalog/pg_type.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 0186636d9f..cd14bca2d0 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,14 +11,20 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -29,6 +35,8 @@ do {															\
 				 errmsg("function can only be called when server is in binary upgrade mode"))); \
 } while (0)
 
+static Datum binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS);
+
 Datum
 binary_upgrade_set_next_pg_tablespace_oid(PG_FUNCTION_ARGS)
 {
@@ -261,3 +269,135 @@ binary_upgrade_set_missing_value(PG_FUNCTION_ARGS)
 
 	PG_RETURN_VOID();
 }
+
+/*
+ * binary_upgrade_create_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * table.
+ */
+static Datum
+binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) ||
+		PG_ARGISNULL(1) ||
+		PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_create_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+
+	if (PG_ARGISNULL(3))
+		sublsn = InvalidXLogRecPtr;
+	else
+		sublsn = PG_GETARG_LSN(3);
+
+	if (!OidIsValid(relid))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("invalid relation identifier used: %u", relid));
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/* The wrappers below are all to appease opr_sanity */
+Datum
+binary_upgrade_create_sub_rel_state_a(PG_FUNCTION_ARGS)
+{
+	return binary_upgrade_create_sub_rel_state(fcinfo);
+}
+
+Datum
+binary_upgrade_create_sub_rel_state_b(PG_FUNCTION_ARGS)
+{
+	return binary_upgrade_create_sub_rel_state(fcinfo);
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	sublsn;
+	char		originname[NAMEDATALEN];
+	RepOriginId originid;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) ||
+		PG_ARGISNULL(1))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	sublsn = PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+	originid = replorigin_by_name(originname, false);
+	replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f7b6176692..7beefe0307 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4576,6 +4577,94 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  get information about subscription membership for dumpable tables.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i;
+	int			cur_rel = 0;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 100000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get subscription relation fields */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[cur_rel].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[cur_rel].dobj.catId.tableoid = relid;
+		subrinfo[cur_rel].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[cur_rel].dobj);
+		subrinfo[cur_rel].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[cur_rel].tblinfo = tblinfo;
+		subrinfo[cur_rel].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+		subrinfo[cur_rel].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[cur_rel].dobj), fout);
+
+		cur_rel++;
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4601,6 +4690,7 @@ getSubscriptions(Archive *fout)
 	int			i_subpublications;
 	int			i_subbinary;
 	int			i_subpasswordrequired;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4655,15 +4745,19 @@ getSubscriptions(Archive *fout)
 	if (fout->remoteVersion >= 160000)
 		appendPQExpBufferStr(query,
 							 " s.suborigin,\n"
-							 " s.subpasswordrequired\n");
+							 " s.subpasswordrequired,\n");
 	else
 		appendPQExpBuffer(query,
 						  " '%s' AS suborigin,\n"
-						  " 't' AS subpasswordrequired\n",
+						  " 't' AS subpasswordrequired,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	appendPQExpBufferStr(query, "o.remote_lsn\n");
+
 	appendPQExpBufferStr(query,
 						 "FROM pg_subscription s\n"
+						 "LEFT JOIN pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4689,6 +4783,7 @@ getSubscriptions(Archive *fout)
 	i_subdisableonerr = PQfnumber(res, "subdisableonerr");
 	i_suborigin = PQfnumber(res, "suborigin");
 	i_subpasswordrequired = PQfnumber(res, "subpasswordrequired");
+	i_suboriginremotelsn = PQfnumber(res, "remote_lsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4721,6 +4816,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
 		subinfo[i].subpasswordrequired =
 			pg_strdup(PQgetvalue(res, i, i_subpasswordrequired));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4730,6 +4830,73 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  dump the definition of the given subscription table mapping
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_create_sub_rel_state will add the subscription
+		 * relation to the pg_subscription_rel catalog; this is supported
+		 * only during binary upgrade.
+		 */
+		if (fout->dopt->binary_upgrade && fout->remoteVersion >= 100000)
+		{
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_create_sub_rel_state(");
+			appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+			appendPQExpBuffer(query,
+							  ", %u, '%c'",
+							  subrinfo->tblinfo->dobj.catId.oid,
+							  subrinfo->srsubstate);
+
+			if (subrinfo->srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", '%s'",
+								  subrinfo->srsublsn);
+
+			appendPQExpBufferStr(query, ");\n");
+		}
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4807,6 +4974,14 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10425,6 +10600,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18491,6 +18669,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 9036b13f6a..dd7ae15505 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -82,7 +82,8 @@ typedef enum
 	DO_PUBLICATION,
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
-	DO_SUBSCRIPTION
+	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL
 } DumpableObjectType;
 
 /*
@@ -670,8 +671,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *subpasswordrequired;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -696,6 +710,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -755,5 +770,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index 523a19c155..5bf1e47ee6 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -93,6 +93,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -146,10 +147,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1542,6 +1544,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d)",
+					 obj->dumpId);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 56e313f562..54afad7359 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -20,6 +20,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_state(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -104,6 +105,8 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
+	check_for_subscription_state(&old_cluster);
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -785,6 +788,102 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_state()
+ *
+ * Verify that all subscriptions have a valid remote_lsn and don't contain
+ * any table with srsubstate other than ready ('r').
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	/* PG 10 introduced subscriptions. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1000)
+		return;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subscription_state.txt");
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin_status only once. */
+		if (dbnum == 0)
+		{
+			res = executeQueryOrDie(conn,
+									"SELECT s.subname, d.datname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT JOIN pg_catalog.pg_replication_origin_status os"
+									"  ON os.external_id = 'pg_' || s.oid "
+									"LEFT JOIN pg_catalog.pg_database d"
+									"  ON d.oid = s.subdbid "
+									"WHERE coalesce(remote_lsn, '0/0') = '0/0' "
+									"ORDER BY d.datname");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "database:%s has subscription:%s with an invalid remote_lsn\n",
+						PQgetvalue(res, i, 1),
+						PQgetvalue(res, i, 0));
+			}
+			PQclear(res);
+		}
+
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, c.relname "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"WHERE srsubstate != 'r' "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+							output_path, strerror(errno));
+
+			fprintf(script, "database:%s subscription:%s relation:%s in non-ready state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions with an\n"
+				 "invalid remote_lsn or with relations not in the ready state.\n"
+				 "A list of subscriptions having an invalid remote_lsn and/or\n"
+				 "relations not in the ready state is in the file: %s",
+				 output_path);
+	}
+	else
+		check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 12a97f84e2..9ea25dec70 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -42,6 +42,7 @@ tests += {
     'tests': [
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
+      't/003_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/t/003_subscription.pl b/src/bin/pg_upgrade/t/003_subscription.pl
new file mode 100644
index 0000000000..350c7971f0
--- /dev/null
+++ b/src/bin/pg_upgrade/t/003_subscription.pl
@@ -0,0 +1,230 @@
+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use Cwd qw(abs_path);
+use File::Basename qw(dirname);
+use File::Compare;
+use File::Find qw(find);
+use File::Path qw(rmtree);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::AdjustUpgrade;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $bindir = $new_sub->config_data('--bindir');
+
+sub insert_line
+{
+	my $payload = shift;
+
+	foreach ("t1", "t2")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("t1", "t2")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line('before initial sync');
+
+# Setup logical replication, replicating only 1 table
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR TABLE t1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub CONNECTION '$connstr' PUBLICATION pub");
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'sub');
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if there's a subscription without a
+# valid remote_lsn.
+# ------------------------------------------------------
+
+# Replication origin's remote_lsn isn't set if no data is replicated after the
+# initial sync.
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with invalid remote_lsn');
+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ not removed after pg_upgrade failure");
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Make sure the replication origin is set
+insert_line('after initial sync');
+$old_sub->wait_for_subscription_sync($publisher, 'sub');
+
+my $result = $old_sub->safe_psql('postgres',
+	"SELECT COUNT(*) FROM pg_subscription_rel WHERE srsubstate != 'r'");
+is($result, qq(0),
+	"All tables in pg_subscription_rel should be in ready state");
+
+# Ensure that all relations have reached the 'ready' state
+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r');";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the number of rows for each table on each server
+$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2), "check initial t1 table data on publisher");
+$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(2), "check initial t2 table data on publisher");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2), "check initial t1 table data on the old subscriber");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0), "check initial t2 table data on the old subscriber");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if there's a subscription with tables in
+# a state different than 'r' (ready).
+# ------------------------------------------------------
+
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub DISABLE");
+
+# Set tables to 'i' state
+$old_sub->safe_psql(
+	'postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'i' WHERE srsubstate = 'r'");
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with incorrect sub rel');
+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ not removed after pg_upgrade failure");
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# ------------------------------------------------------
+# Check that pg_upgrade doesn't detect any problem once all the subscription's
+# relations are in 'r' (ready) state.
+# ------------------------------------------------------
+
+$old_sub->safe_psql(
+	'postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'r' WHERE srsubstate = 'i'");
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with correct sub rel');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after pg_upgrade success");
+
+# ------------------------------------------------------
+# Check that after upgrading the subscriber server, the incremental
+# changes added to the publisher are replicated.
+# ------------------------------------------------------
+
+# Stop the old subscriber, insert a row in each table while it's down and add
+# t2 to the publication
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+insert_line('while old_sub is down');
+
+# Run pg_upgrade
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+	],
+	'run of pg_upgrade for new sub');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after pg_upgrade success");
+$publisher->safe_psql('postgres', "ALTER PUBLICATION pub ADD TABLE t2");
+
+$new_sub->start;
+
+# Subscription relations and replication origin remote_lsn should be preserved
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(1), "There should be 1 row in pg_subscription_rel");
+
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# There should be no new replicated rows before enabling the subscription
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2),
+	"t1 table has no new replicated rows before enabling the subscription");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in t2 table which is not part of the publication");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
+
+$publisher->wait_for_catchup('sub');
+
+# Rows on t1 should have been replicated, while nothing should happen for t2
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(3), "check replicated inserts on new subscriber");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in t2, which is not in the publication, after enabling the subscription"
+);
+
+# Refresh the subscription, only the missing row on t2 should be replicated
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'sub');
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(3),
+	"check t1 is unchanged when no new changes were replicated");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(3),
+	"check replicated inserts on new subscriber after refreshing");
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9805bc6118..67eb6302c6 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11370,6 +11370,27 @@
   proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
   proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
   prosrc => 'binary_upgrade_set_next_pg_tablespace_oid' },
+{ oid => '4551', descr => 'add a relation with the specified relation state to pg_subscription_rel table',
+  proname => 'binary_upgrade_create_sub_rel_state', prorettype => 'void',
+  proargtypes => 'text oid char',
+  proallargtypes => '{text,oid,char}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{subname,relid,relstate}',
+  prosrc => 'binary_upgrade_create_sub_rel_state_a' },
+{ oid => '4552', descr => 'add a relation with the specified relation state to pg_subscription_rel table',
+  proname => 'binary_upgrade_create_sub_rel_state', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  proallargtypes => '{text,oid,char,pg_lsn}',
+  proargmodes => '{i,i,i,i}',
+  proargnames => '{subname,relid,relstate,sublsn}',
+  prosrc => 'binary_upgrade_create_sub_rel_state_b' },
+{ oid => '4553', descr => 'update the remote_lsn for the subscriber\'s replication origin',
+  proname => 'binary_upgrade_replorigin_advance', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  proallargtypes => '{text,pg_lsn}',
+  proargmodes => '{i,i}',
+  proargnames => '{subname,sublsn}',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index f3d8a2a855..21cc0e6c35 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2651,6 +2651,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#67vignesh C
vignesh21@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#63)
Re: pg_upgrade and logical replication

On Tue, 12 Sept 2023 at 18:52, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:

On Monday, September 11, 2023 6:32 PM vignesh C <vignesh21@gmail.com> wrote:

The attached v7 patch has the changes for the same.

Thanks for updating the patch, here are few comments:

1.

+/*
+ * binary_upgrade_sub_replication_origin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_sub_replication_origin_advance(PG_FUNCTION_ARGS)
+{

Is there any usage apart from pg_upgrade for this function, if not, I think
we'd better move this function to pg_upgrade_support.c. If yes, it may be
better to rename it to something more general.

Moved to pg_upgrade_support.c and renamed to binary_upgrade_replorigin_advance

2.

+ * Verify that all subscriptions have a valid remote_lsn and don't contain
+ * any table in srsubstate different than ready ('r').
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)

I think we'd better follow the same style of
check_for_isn_and_int8_passing_mismatch() to record the invalid things in a
file.

Modified

3.

+               if (fout->dopt->binary_upgrade && fout->remoteVersion >= 100000)
+               {
+                       appendPQExpBuffer(query,
+                                                         "SELECT binary_upgrade_create_sub_rel_state('%s', %u, '%c'",
+                                                         subrinfo->dobj.name,

I think we'd better consider using appendStringLiteral or related function for
the dobj.name here to make sure the string conversion is safe.

Modified

4.

The following commit message may need update:
"binary_upgrade_create_sub_rel_state SQL function, and also provides an
additional LSN parameter for CREATE SUBSCRIPTION to restore the underlying
replication origin remote LSN. "

I think we have changed to another approach which doesn't provide new parameter
in DDL.

Modified

5.
+       /* Fetch the existing tuple. */
+       tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+                                                         CStringGetDatum(subname));

Since we don't modify the tuple here, SearchSysCache2 seems enough.

6.
+                                                                       "LEFT JOIN pg_catalog.pg_database d"
+                                                                       "  ON d.oid = s.subdbid "
+                                                                       "WHERE coalesce(remote_lsn, '0/0') = '0/0'");

For the subscriptions that were just created and finished the table sync but
haven't applied any changes, their remote_lsn will also be 0/0. Do we
need to report ERROR in this case ?

I will handle this in the next version.

Thanks for the comments, the v8 patch attached at [1] has the changes
for the same.

[1]: /messages/by-id/CALDaNm1JzqTreCUrhNu5E1gq7Q8r_u3+FrisyT7moOED=UdoCg@mail.gmail.com

Regards,
Vignesh
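
As an aside on point 3 above (using appendStringLiteral for dobj.name): the
risk is subscription names containing quotes. A minimal sketch of the
quote-doubling rule, with append_string_literal as a simplified stand-in for
the real pg_dump helper (no encoding or backslash handling):

```c
#include <assert.h>
#include <string.h>

/*
 * Simplified stand-in for appendStringLiteral(): append src to dst as a
 * single-quoted SQL literal, doubling any embedded single quotes.  The
 * real helper also handles encodings and standard_conforming_strings;
 * this only shows the doubling rule that makes a name such as "o'sub"
 * safe to splice into a query.
 */
static void
append_string_literal(char *dst, const char *src)
{
	strcat(dst, "'");
	for (const char *p = src; *p; p++)
	{
		if (*p == '\'')
			strcat(dst, "''");	/* double embedded quotes */
		else
			strncat(dst, p, 1);
	}
	strcat(dst, "'");
}
```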

#68vignesh C
vignesh21@gmail.com
In reply to: vignesh C (#66)
Re: pg_upgrade and logical replication

On Fri, 15 Sept 2023 at 15:08, vignesh C <vignesh21@gmail.com> wrote:

On Tue, 12 Sept 2023 at 14:25, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Dear Vignesh,

Thank you for updating the patch! Here are some comments.

Sorry if there are duplicate comments - the thread was revived recently so I
may have lost track of the earlier discussion.

01. General

Is there a possibility that the apply worker on the old cluster connects to the
publisher during the upgrade? Regarding pg_upgrade on the publisher, we
refuse TCP/IP connections from remote hosts and the port number is also changed,
so we can assume that the subscriber does not connect. But IIUC such settings may
not affect the connection source, so the apply worker may still try to connect to
the publisher. Also, are there any hazards if it happens?

Yes, there is a possibility that the apply worker gets started and new
transaction data is being synced from the publisher. I have made a fix
not to start the launcher process in binary upgrade mode as we don't
want the launcher to start an apply worker during upgrade.

Another approach to solve this as suggested by one of my colleague
Hou-san would be to set max_logical_replication_workers = 0 while
upgrading. I will evaluate this and update the next version of patch
accordingly.

Regards,
Vignesh

#69Michael Paquier
michael@paquier.xyz
In reply to: vignesh C (#68)
Re: pg_upgrade and logical replication

On Fri, Sep 15, 2023 at 04:51:57PM +0530, vignesh C wrote:

Another approach to solve this as suggested by one of my colleague
Hou-san would be to set max_logical_replication_workers = 0 while
upgrading. I will evaluate this and update the next version of patch
accordingly.

In the context of an upgrade, any node started is isolated with its
own port and a custom unix domain directory with connections allowed
only through this one.

Saying that, I don't see why forcing max_logical_replication_workers
to be 0 would be necessarily a bad thing to prevent unnecessary
activity on the backend. This should be a separate patch built on
top of the main one, IMO.

Looking forward to seeing the rebased version you've mentioned, btw ;)
--
Michael

#70vignesh C
vignesh21@gmail.com
In reply to: Michael Paquier (#69)
1 attachment(s)
Re: pg_upgrade and logical replication

On Tue, 19 Sept 2023 at 11:49, Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Sep 15, 2023 at 04:51:57PM +0530, vignesh C wrote:

Another approach to solve this as suggested by one of my colleague
Hou-san would be to set max_logical_replication_workers = 0 while
upgrading. I will evaluate this and update the next version of patch
accordingly.

In the context of an upgrade, any node started is isolated with its
own port and a custom unix domain directory with connections allowed
only through this one.

Saying that, I don't see why forcing max_logical_replication_workers
to be 0 would be necessarily a bad thing to prevent unnecessary
activity on the backend. This should be a separate patch built on
top of the main one, IMO.

Here is a patch to set max_logical_replication_workers to 0 when the
server is started, to prevent the launcher from being started. Since
this configuration is present from v10, no need for any version check.
I have done upgrade tests for v10-master, v11-master, ... v16-master
and found it to be working fine.

Regards,
Vignesh

Attachments:

v1-0001-Don-t-start-launcher-process-to-be-started-while-.patch (text/x-patch)
From f8ce16694ac4324113d948482818047f5457f924 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Tue, 19 Sep 2023 17:41:07 +0530
Subject: [PATCH v1] Don't start launcher process to be started while
 upgrading.

We don't want the launcher to run while upgrading because the launcher might
start an apply worker which will start receiving changes from the publisher
and update the old cluster before the upgrade is completed.
---
 src/bin/pg_upgrade/server.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 0bc3d2806b..fa728d2b79 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -233,15 +233,16 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
 	 * Turn off durability requirements to improve object creation speed, and
 	 * we only modify the new cluster, so only use it there.  If there is a
 	 * crash, the new cluster has to be recreated anyway.  fsync=off is a big
-	 * win on ext4.
+	 * win on ext4. max_logical_replication_workers=0 to disable launcher.
 	 */
 	snprintf(cmd, sizeof(cmd),
-			 "\"%s/pg_ctl\" -w -l \"%s/%s\" -D \"%s\" -o \"-p %d -b%s %s%s\" start",
+			 "\"%s/pg_ctl\" -w -l \"%s/%s\" -D \"%s\" -o \"-p %d -b%s %s%s%s\" start",
 			 cluster->bindir,
 			 log_opts.logdir,
 			 SERVER_LOG_FILE, cluster->pgconfig, cluster->port,
 			 (cluster == &new_cluster) ?
 			 " -c synchronous_commit=off -c fsync=off -c full_page_writes=off" : "",
+			 " -c max_logical_replication_workers=0",
 			 cluster->pgopts ? cluster->pgopts : "", socket_string);
 
 	/*
-- 
2.34.1

#71Michael Paquier
michael@paquier.xyz
In reply to: vignesh C (#70)
Re: pg_upgrade and logical replication

On Tue, Sep 19, 2023 at 07:14:49PM +0530, vignesh C wrote:

Here is a patch to set max_logical_replication_workers as 0 while the
server is started to prevent the launcher from being started. Since
this configuration is present from v10, no need for any version check.
I have done upgrade tests for v10-master, v11-master, ... v16-master
and found it to be working fine.

The project policy is to support pg_upgrade for 10 years, and 9.6 was
released in 2016:
https://www.postgresql.org/docs/9.6/release-9-6.html

snprintf(cmd, sizeof(cmd),
-             "\"%s/pg_ctl\" -w -l \"%s/%s\" -D \"%s\" -o \"-p %d -b%s %s%s\" start",
+             "\"%s/pg_ctl\" -w -l \"%s/%s\" -D \"%s\" -o \"-p %d -b%s %s%s%s\" start",
cluster->bindir,
log_opts.logdir,
SERVER_LOG_FILE, cluster->pgconfig, cluster->port,
(cluster == &new_cluster) ?
" -c synchronous_commit=off -c fsync=off -c full_page_writes=off" : "",
+             " -c max_logical_replication_workers=0",
cluster->pgopts ? cluster->pgopts : "", socket_string);

/*

And this code path is used to start postmaster instances for old and
new clusters. So it seems to me that it is incorrect if this is not
conditional based on the cluster version.
--
Michael
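
The version dependency Michael points out can be sketched as follows; only
clusters of v10 or later understand max_logical_replication_workers, so the
GUC must not be passed to a 9.6-or-older postmaster. ClusterInfo and the
version encoding (100000 for v10) are simplified stand-ins for pg_upgrade's
definitions, not the real code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified model of a pg_upgrade cluster. */
typedef struct
{
	int		major_version;	/* e.g. 100000 for v10, 90600 for 9.6 */
	int		is_new_cluster;
} ClusterInfo;

/*
 * Build the extra "-o" options, adding max_logical_replication_workers=0
 * only when the target server is new enough to accept the GUC.
 */
static void
build_server_opts(const ClusterInfo *cluster, char *buf, size_t len)
{
	snprintf(buf, len, "-b%s%s",
			 cluster->is_new_cluster ?
			 " -c synchronous_commit=off -c fsync=off" : "",
			 cluster->major_version >= 100000 ?
			 " -c max_logical_replication_workers=0" : "");
}
```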

#72Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#66)
Re: pg_upgrade and logical replication

On Fri, Sep 15, 2023 at 3:08 PM vignesh C <vignesh21@gmail.com> wrote:

The attached v8 version patch has the changes for the same.

Is the check to ensure remote_lsn is valid correct in function
check_for_subscription_state()? How about the case where the apply
worker didn't receive any change but just marked the relation as
'ready'?

Also, the patch seems to be allowing subscription relations from PG >=10
to be migrated but how will that work if the corresponding
publisher is also upgraded without slots? Won't the corresponding
workers start failing as soon as you restart the upgrade server? Do we
need to document the steps for users?
need to document the steps for users?

--
With Regards,
Amit Kapila.

#73Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#72)
Re: pg_upgrade and logical replication

On Wed, Sep 20, 2023 at 04:54:36PM +0530, Amit Kapila wrote:

Also, the patch seems to be allowing subscription relations from PG >=10
to be migrated but how will that work if the corresponding
publisher is also upgraded without slots? Won't the corresponding
workers start failing as soon as you restart the upgrade server? Do we
need to document the steps for users?

Hmm? How is that related to the upgrade of the subscribers? And how
is that different from the case where a subscriber tries to connect
back to a publisher where a slot has been dropped? There is no need
of pg_upgrade to reach such a state:
ERROR: could not start WAL streaming: ERROR: replication slot "popo" does not exist
--
Michael

#74Michael Paquier
michael@paquier.xyz
In reply to: vignesh C (#66)
Re: pg_upgrade and logical replication

On Fri, Sep 15, 2023 at 03:08:21PM +0530, vignesh C wrote:

On Tue, 12 Sept 2023 at 14:25, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Is there a possibility that the apply worker on the old cluster connects to the
publisher during the upgrade? Regarding pg_upgrade on the publisher, we
refuse TCP/IP connections from remote hosts and the port number is also changed,
so we can assume that the subscriber does not connect. But IIUC such settings may
not affect the connection source, so the apply worker may still try to connect to
the publisher. Also, are there any hazards if it happens?

Yes, there is a possibility that the apply worker gets started and new
transaction data is being synced from the publisher. I have made a fix
not to start the launcher process in binary upgrade mode as we don't
want the launcher to start an apply worker during upgrade.

Hmm. I was wondering if 0001 is the right way to handle this case,
but at the end I'm OK to paint one extra isBinaryUpgrade in the code
path where apply launchers are registered. I don't think that the
patch is complete, though. A comment should be added in pg_upgrade's
server.c, exactly start_postmaster(), to tell that -b also stops apply
workers. I am attaching a version updated as of the attached, that
I'd be OK to apply.

I don't really think that we need to worry about a subscriber
connecting back to a publisher in this case, though? I mean, each
postmaster instance started by pg_upgrade restricts the access to the
instance with unix_socket_directories set to a custom path and
permissions at 0700, and a subscription's connection string does not
know the unix path used by pg_upgrade. I certainly agree that
stopping these processes could lead to inconsistencies in the data the
subscribers have been holding though, if we are not careful, so
preventing them from running is a good practice anyway.

I have also reviewed 0002. As a whole, I think that I'm OK with the
main approach of the patch in pg_dump to use a new type of dumpable
object for subscription relations that are dumped with their upgrade
functions after. This still needs more work, and more documentation.
Also, perhaps we should really have an option to control if this part
of the copy happens or not. With a --no-subscription-relations for
pg_dump at least?

+{ oid => '4551', descr => 'add a relation with the specified relation state to pg_subscription_rel table',

During a development cycle, any new function added needs to use an OID
in range 8000-9999. Running unused_oids will suggest new random OIDs.

FWIW, I am not convinced that there is a need for two functions to add
an entry to pg_subscription_rel, with sole difference between both the
handling of a valid or invalid LSN. We should have only one function
that's able to handle NULL for the LSN. So let's remove rel_state_a
and rel_state_b, and have a single rel_state(). The description of
the SQL functions is inconsistent with the other binary upgrade ones,
I would suggest for the two functions:
"for use by pg_upgrade (relation for pg_subscription_rel)"
"for use by pg_upgrade (remote_lsn for origin)"

+   i_srsublsn = PQfnumber(res, "srsublsn");
[...]
+       subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));

In getSubscriptionTables(), this should check for PQgetisnull()
because we would have a NULL value for InvalidXLogRecPtr in the
catalog. Using a char* for srsublsn is OK, but just assign NULL to
it, then just pass a hardcoded NULL value to the function as we do in
other places. So I don't quite get why this is not the same handling
as suboriginremotelsn.

getSubscriptionTables() is entirely skipped if we don't want any
subscriptions, if we deal with a server of 9.6 or older or if we don't
do binary upgrades, which is OK.

+/*
+ * getSubscriptionTables
+ *	  get information about subscription membership for dumpable tables.
+ */
This comment is slightly misleading and should mention that this is an
upgrade-only path?

The code for dumpSubscriptionTable() is a copy-paste of
dumpPublicationTable(), but a lot of what you are doing here is
actually pointless if we are not in binary mode? Why is this code
path not skipped under dataOnly? I mean, this is a code path we
should never take except if we are in binary mode. This should have
at least a cross-check to make sure that we never have a
DO_SUBSCRIPTION_REL in this code path if we are in non-binary mode.

+    if (dopt->binary_upgrade && subinfo->suboriginremotelsn)
+    {
+        appendPQExpBufferStr(query,
+                             "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+        appendStringLiteralAH(query, subinfo->dobj.name, fout);
+        appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+    }

Hmm.. Could it be actually useful even for debugging to still have
this query if suboriginremotelsn is an InvalidXLogRecPtr? I think
that this should have a comment of the kind "\n-- For binary upgrade,
blah". At least it would not be a bad thing to enforce a correct
state from the start, removing the NULL check for the second argument
in binary_upgrade_replorigin_advance().

+ /* We need to check for pg_replication_origin_status only once. */
Perhaps it would be better to explain why?

+ "WHERE coalesce(remote_lsn, '0/0') = '0/0'"
Why a COALESCE here? Cannot this stuff just use NULL?

+ fprintf(script, "database:%s subscription:%s relation:%s in non-ready state\n",
Could it be possible to include the schema of the relation in this log?

+static void check_for_subscription_state(ClusterInfo *cluster);
I'd be tempted to move that into a patch on its own, actually, for a
cleaner history.

+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
New as of 2023.

+# Check that after upgradation of the subscriber server, the incremental
+# changes added to the publisher are replicated.
[..]
+   For upgradation of the subscriptions, all the subscriptions on the old
+   cluster must have a valid <varname>remote_lsn</varname>, and all the

Upgradation? I think that this should be reworded:
"All the subscriptions of an old cluster require a valid remote_lsn
during an upgrade."

A CI run is reporting the following compilation warnings:
[04:21:15.290] pg_dump.c: In function ‘getSubscriptionTables’:
[04:21:15.290] pg_dump.c:4655:29: error: ‘subinfo’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
[04:21:15.290] 4655 | subrinfo[cur_rel].subinfo = subinfo;

+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ not removed after pg_upgrade failure");
Not sure that there's a need for this check.  Okay, that's cheap.

And, err. We are going to need an option to control if the slot data
is copied, and a bit more documentation in pg_upgrade to explain how
things happen when the copy happens.
--
Michael

#75Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#72)
Re: pg_upgrade and logical replication

On Wed, Sep 20, 2023 at 04:54:36PM +0530, Amit Kapila wrote:

Is the check to ensure remote_lsn is valid correct in function
check_for_subscription_state()? How about the case where the apply
worker didn't receive any change but just marked the relation as
'ready'?

I may be missing, of course, but a relation is switched to
SUBREL_STATE_READY only once a sync happened and its state was
SUBREL_STATE_SYNCDONE, implying that SubscriptionRelState->lsn is
never InvalidXLogRecPtr, no?

For instance, nothing happens when a
Assert(!XLogRecPtrIsInvalid(rstate->lsn)) is added in
process_syncing_tables_for_apply().
--
Michael
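
The invariant debated here, that every pg_subscription_rel entry must have
reached 'r' before an upgrade can proceed, can be sketched as below. The
single-letter codes match srsubstate; the helper name is hypothetical, since
the real check_for_subscription_state() runs a catalog query instead:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of the srsubstate check: the upgrade may only proceed when every
 * relation is in the 'ready' state ('r').  Any other state ('i' init,
 * 'd' data sync, 'f' finished copy, 's' sync done) means a table sync is
 * still in flight on the old cluster.
 */
static bool
all_rels_ready(const char *states, int nrels)
{
	for (int i = 0; i < nrels; i++)
	{
		if (states[i] != 'r')
			return false;
	}
	return true;
}
```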

#76Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#75)
Re: pg_upgrade and logical replication

On Thu, Sep 21, 2023 at 11:37 AM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Sep 20, 2023 at 04:54:36PM +0530, Amit Kapila wrote:

Is the check to ensure remote_lsn is valid correct in function
check_for_subscription_state()? How about the case where the apply
worker didn't receive any change but just marked the relation as
'ready'?

I may be missing, of course, but a relation is switched to
SUBREL_STATE_READY only once a sync happened and its state was
SUBREL_STATE_SYNCDONE, implying that SubscriptionRelState->lsn is
never InvalidXLogRecPtr, no?

The check in the patch is about the logical replication worker's
origin's LSN. The value of SubscriptionRelState->lsn won't matter for
the check.

--
With Regards,
Amit Kapila.
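
Both values in question are of type pg_lsn, whose "0/0" text form corresponds
to InvalidXLogRecPtr; that is why the patch's check treats a 0/0 remote_lsn
as "the origin was never advanced". A sketch of the textual encoding, where
parse_lsn is a hypothetical helper (the backend's implementation is
pg_lsn_in()):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define InvalidXLogRecPtr	((uint64_t) 0)

/*
 * Parse an LSN in the "X/Y" text form used by pg_lsn: the high and low
 * 32-bit halves in hex, separated by a slash.  "0/0" maps to
 * InvalidXLogRecPtr.  Hypothetical helper for illustration only.
 */
static uint64_t
parse_lsn(const char *str)
{
	unsigned int hi,
				lo;

	if (sscanf(str, "%X/%X", &hi, &lo) != 2)
		return InvalidXLogRecPtr;
	return ((uint64_t) hi << 32) | lo;
}
```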

#77Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#73)
Re: pg_upgrade and logical replication

On Thu, Sep 21, 2023 at 4:39 AM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Sep 20, 2023 at 04:54:36PM +0530, Amit Kapila wrote:

Also, the patch seems to be allowing subscription relations from PG >=10
to be migrated but how will that work if the corresponding
publisher is also upgraded without slots? Won't the corresponding
workers start failing as soon as you restart the upgrade server? Do we
need to document the steps for users?

Hmm? How is that related to the upgrade of the subscribers?

It is because after upgrade of both publisher and subscriber, the
subscriptions won't work. Both publisher and subscriber should work,
otherwise, the logical replication set up won't work. I think we can
probably do this, if we can document clearly how the user can make
their logical replication set up work after upgrade.

And how
is that different from the case where a subscriber tries to connect
back to a publisher where a slot has been dropped?

It is different because we don't drop slots automatically anywhere else.

--
With Regards,
Amit Kapila.

#78Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#77)
Re: pg_upgrade and logical replication

On Thu, Sep 21, 2023 at 02:35:55PM +0530, Amit Kapila wrote:

It is because after upgrade of both publisher and subscriber, the
subscriptions won't work. Both publisher and subscriber should work,
otherwise, the logical replication set up won't work. I think we can
probably do this, if we can document clearly how the user can make
their logical replication set up work after upgrade.

Yeah, well, this comes back to my original point that the upgrade of
publisher nodes and subscriber nodes should be treated as two
different problems or we're mixing apples and oranges (and a node
could have both subscriber and publishers). While being able to
support both is a must, it is going to be a two-step process at the
end, with the subscribers done first and the publishers done after.
That's also kind of the point that Julien makes in top message of this
thread.

I agree that docs are lacking in the proposed patch in terms of
restrictions, assumptions and process flow, but taken in isolation the
problem of the publishers is not something that this patch has to take
care of. I'd certainly agree that it should mention, at least and if
merged first, to be careful if upgrading the publishers as its slots
are currently removed.
--
Michael

#79Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#71)
1 attachment(s)
Re: pg_upgrade and logical replication

On Wed, Sep 20, 2023 at 09:38:56AM +0900, Michael Paquier wrote:

And this code path is used to start postmaster instances for old and
new clusters. So it seems to me that it is incorrect if this is not
conditional based on the cluster version.

Avoiding the startup of bgworkers during pg_upgrade is something that
worries me a bit, actually, as it could be useful in some cases like
monitoring? That would be fancy, for sure. For now, and seeing a
lack of consensus on this larger matter, I'd like to propose a check
for IsBinaryUpgrade into ApplyLauncherRegister() instead as it makes
no real sense to start apply workers in this context. That would be
equivalent to max_logical_replication_workers = 0.

Amit, Vignesh, would the attached be OK for both of you?

(Vignesh has posted a slightly different version of this patch on a
different thread, but the subscriber part should be part of this
thread with the subscribers, I assume.)
--
Michael

Attachments:

0001-Prevent-startup-of-logical-replication-launcher-in-b.patch (text/x-diff)
From 2df408695163eb46bfe7efa9a9ccc07ff5fab183 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Mon, 25 Sep 2023 10:54:59 +0900
Subject: [PATCH] Prevent startup of logical replication launcher in binary
 upgrade mode

The logical replication launcher may start apply workers during an
upgrade, which could be the cause of corruptions on a new cluster if
these are able to apply changes before the physical files are copied
over.

The chance of being able to do so should be small as pg_upgrade uses its
own port and unix domain directory (customizable as well with
--socketdir), but just preventing the launcher from starting is safer at the
end, because we are then sure that no changes would ever be applied.

Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm2g9ZKf=y8X6z6MsLCuh8WwU-=Q6pLj35NFi2M5BZNS_A@mail.gmail.com
---
 src/backend/replication/logical/launcher.c | 9 +++++++++
 src/bin/pg_upgrade/server.c                | 2 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 7882fc91ce..9c610edbeb 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -925,6 +925,15 @@ ApplyLauncherRegister(void)
 {
 	BackgroundWorker bgw;
 
+	/*
+	 * We don't want the launcher to run in binary upgrade mode because it may
+	 * start apply workers which could start receiving changes from the
+	 * publisher before the physical files are put in place, causing
+	 * corruption on the new cluster being upgraded to.
+	 */
+	if (IsBinaryUpgrade)
+		return;
+
 	if (max_logical_replication_workers == 0)
 		return;
 
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 0bc3d2806b..edbc101269 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -228,7 +228,7 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
 #endif
 
 	/*
-	 * Use -b to disable autovacuum.
+	 * Use -b to disable autovacuum and logical replication launcher.
 	 *
 	 * Turn off durability requirements to improve object creation speed, and
 	 * we only modify the new cluster, so only use it there.  If there is a
-- 
2.40.1
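
For the launcher itself, the IsBinaryUpgrade check above and the
max_logical_replication_workers tweak are behaviorally equivalent; a toy
model of the two early exits (the names mirror the backend's, but this is a
simplified sketch, not the real bgworker API):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of ApplyLauncherRegister()'s two guards: binary-upgrade mode
 * and a zero max_logical_replication_workers both keep the launcher (and
 * hence any apply workers) from ever being registered.
 */
static bool IsBinaryUpgrade = false;
static int	max_logical_replication_workers = 4;
static int	launchers_registered = 0;

static void
ApplyLauncherRegister(void)
{
	if (IsBinaryUpgrade)
		return;
	if (max_logical_replication_workers == 0)
		return;
	launchers_registered++;
}
```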

#80Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#78)
Re: pg_upgrade and logical replication

On Fri, Sep 22, 2023 at 4:36 AM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Sep 21, 2023 at 02:35:55PM +0530, Amit Kapila wrote:

It is because after upgrade of both publisher and subscriber, the
subscriptions won't work. Both publisher and subscriber should work,
otherwise, the logical replication set up won't work. I think we can
probably do this, if we can document clearly how the user can make
their logical replication set up work after upgrade.

Yeah, well, this comes back to my original point that the upgrade of
publisher nodes and subscriber nodes should be treated as two
different problems or we're mixing apples and oranges (and a node
could have both subscriber and publishers). While being able to
support both is a must, it is going to be a two-step process at the
end, with the subscribers done first and the publishers done after.
That's also kind of the point that Julien makes in top message of this
thread.

I agree that docs are lacking in the proposed patch in terms of
restrictions, assumptions and process flow, but taken in isolation the
problem of the publishers is not something that this patch has to take
care of.

I also don't think that this patch has to solve the problem of
publishers in any way but as per my understanding, if due to some
reason we are not able to do the upgrade of publishers, this can add
more steps for users than they have to do now for logical replication
set up after upgrade. This is because now after restoring the
subscription rel's and origin, as soon as we start replication after
creating the slots on the publisher, we will never be able to
guarantee data consistency. So, they need to drop the entire
subscription setup including truncating the relations, and then set it
up from scratch which also means they need to somehow remember or take
a dump of the current subscription setup. According to me, the key
point is to have a mechanism to set up slots correctly to allow
replication (or subscriptions) to work after the upgrade. Without
that, it appears to me that we are restoring a subscription where it
can start from some random LSN and can easily lead to data consistency
issues where it can miss some of the updates.

This is the primary reason why I prioritized to work on the publisher
side before getting this patch done, otherwise, the solution for this
patch was relatively clear. I am not sure but I guess this could be
the reason why originally we left it in the current state, otherwise,
restoring subscription rel's or origin doesn't seem to be too much of
an additional effort than what we are doing now.

--
With Regards,
Amit Kapila.

#81Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Michael Paquier (#79)
RE: pg_upgrade and logical replication

Dear Michael,

I'd like to propose a check
for IsBinaryUpgrade into ApplyLauncherRegister() instead as it makes
no real sense to start apply workers in this context. That would be
equivalent to max_logical_replication_workers = 0.

Personally, I prefer to change max_logical_replication_workers. Mainly there are
two reasons:

1. Your approach must be back-patched to older versions which support the
logical replication feature, but the oldest one (PG10) is already unsupported.
We should not modify such a branch.
2. Also, the "max_logical_replication_workers = 0" approach would be consistent
with what we are doing now and for the upgrade-of-publisher patch.
Please see the previous discussion [1].

[1]: /messages/by-id/CAA4eK1+WBphnmvMpjrxceymzuoMuyV2_pMGaJq-zNODiJqAa7Q@mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

#82Michael Paquier
michael@paquier.xyz
In reply to: Hayato Kuroda (Fujitsu) (#81)
Re: pg_upgrade and logical replication

On Mon, Sep 25, 2023 at 05:35:18AM +0000, Hayato Kuroda (Fujitsu) wrote:

Personally, I prefer to change max_logical_replication_workers. Mainly there are
two reasons:

1. Your approach must be back-patched to older versions which support logical
replication feature, but the oldest one (PG10) has already been unsupported.
We should not modify such a branch.

This suggestion would be only for HEAD as it changes the behavior of -b.

2. Also, "max_logical_replication_workers = 0" approach would be consistent
with what we are doing now and for upgrade of publisher patch.
Please see the previous discussion [1].

Yeah, you're right. Consistency would be good across the board, and
we'd need to take care of the old clusters as well, so the GUC
enforcement would be needed too. It does not strike me that this
extra IsBinaryUpgrade would hurt anyway? Forcing the hand of the
backend has the merit of allowing the removal of the tweak with
max_logical_replication_workers at some point in the future.
--
Michael

#83Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#80)
Re: pg_upgrade and logical replication

On Mon, Sep 25, 2023 at 10:05:41AM +0530, Amit Kapila wrote:

I also don't think that this patch has to solve the problem of
publishers in any way but as per my understanding, if due to some
reason we are not able to do the upgrade of publishers, this can add
more steps for users than they have to do now for logical replication
set up after upgrade. This is because now after restoring the
subscription rel's and origin, as soon as we start replication after
creating the slots on the publisher, we will never be able to
guarantee data consistency. So, they need to drop the entire
subscription setup including truncating the relations, and then set it
up from scratch which also means they need to somehow remember or take
a dump of the current subscription setup. According to me, the key
point is to have a mechanism to set up slots correctly to allow
replication (or subscriptions) to work after the upgrade. Without
that, it appears to me that we are restoring a subscription where it
can start from some random LSN and can easily lead to data consistency
issues where it can miss some of the updates.

Sure, that's assuming that the publisher side is upgraded. FWIW, my
take is that there's room to move forward with this patch anyway in
favor of cases like rollover upgrades to the subscriber.

This is the primary reason why I prioritized to work on the publisher
side before getting this patch done, otherwise, the solution for this
patch was relatively clear. I am not sure but I guess this could be
the reason why originally we left it in the current state, otherwise,
restoring subscription rel's or origin doesn't seem to be too much of
an additional effort than what we are doing now.

By "additional effort", you are referring to what the patch is doing,
with the binary dump of pg_subscription_rel, right?
--
Michael

#84Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#83)
Re: pg_upgrade and logical replication

On Mon, Sep 25, 2023 at 11:43 AM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Sep 25, 2023 at 10:05:41AM +0530, Amit Kapila wrote:

I also don't think that this patch has to solve the problem of
publishers in any way but as per my understanding, if due to some
reason we are not able to do the upgrade of publishers, this can add
more steps for users than they have to do now for logical replication
set up after upgrade. This is because now after restoring the
subscription rel's and origin, as soon as we start replication after
creating the slots on the publisher, we will never be able to
guarantee data consistency. So, they need to drop the entire
subscription setup including truncating the relations, and then set it
up from scratch which also means they need to somehow remember or take
a dump of the current subscription setup. According to me, the key
point is to have a mechanism to set up slots correctly to allow
replication (or subscriptions) to work after the upgrade. Without
that, it appears to me that we are restoring a subscription where it
can start from some random LSN and can easily lead to data consistency
issues where it can miss some of the updates.

Sure, that's assuming that the publisher side is upgraded.

At some point, the user needs to upgrade the publisher, and the
subscriber could itself have some publications defined, which means
the downstream subscribers will have the same problem.

FWIW, my
take is that there's room to move forward with this patch anyway in
favor of cases like rollover upgrades to the subscriber.

This is the primary reason why I prioritized to work on the publisher
side before getting this patch done, otherwise, the solution for this
patch was relatively clear. I am not sure but I guess this could be
the reason why originally we left it in the current state, otherwise,
restoring subscription rel's or origin doesn't seem to be too much of
an additional effort than what we are doing now.

By "additional effort", you are referring to what the patch is doing,
with the binary dump of pg_subscription_rel, right?

Yes.

--
With Regards,
Amit Kapila.

#85vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#72)
Re: pg_upgrade and logical replication

On Wed, 20 Sept 2023 at 16:54, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Sep 15, 2023 at 3:08 PM vignesh C <vignesh21@gmail.com> wrote:

The attached v8 version patch has the changes for the same.

Is the check to ensure remote_lsn is valid correct in function
check_for_subscription_state()? How about the case where the apply
worker didn't receive any change but just marked the relation as
'ready'?

I agree that remote_lsn will not be valid in the case when all the
tables are in ready state and there are no changes to be sent by the
walsender to the worker. I was not sure if this check is required in
this case in the check_for_subscription_state function. I was thinking
that this check could be removed.
I'm also checking why the tables should only be in ready state (the
check that is there in the same function), and whether we can support
upgrades when the tables are in syncdone state. I will post my
analysis once I have finished checking on the same.

Regards,
Vignesh

#86Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Michael Paquier (#82)
RE: pg_upgrade and logical replication

Dear Michael,

1. Your approach must be back-patched to older versions which support the
logical replication feature, but the oldest one (PG10) is already
unsupported. We should not modify such a branch.

This suggestion would be only for HEAD as it changes the behavior of -b.

2. Also, "max_logical_replication_workers = 0" approach would be consistent
with what we are doing now and for upgrade of publisher patch.
Please see the previous discussion [1].

Yeah, you're right. Consistency would be good across the board, and
we'd need to take care of the old clusters as well, so the GUC
enforcement would be needed as well. It does not strike me that this
extra IsBinaryUpgrade would hurt anyway? Forcing the hand of the
backend has the merit of allowing the removal of the tweak with
max_logical_replication_workers at some point in the future.

Hmm, our initial motivation is to suppress registering the launcher, and adding
a GUC setting is sufficient for that. Indeed, registering a launcher may be
harmful, but that does not seem to be the goal of this thread (changing the -b
workflow in HEAD alone is not sufficient for the issue). I'm not sure it should
be included in the patch sets here.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

#87Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#84)
Re: pg_upgrade and logical replication

On Tue, Sep 26, 2023 at 09:40:48AM +0530, Amit Kapila wrote:

On Mon, Sep 25, 2023 at 11:43 AM Michael Paquier <michael@paquier.xyz> wrote:

Sure, that's assuming that the publisher side is upgraded.

At some point, user needs to upgrade publisher and subscriber could
itself have some publications defined which means the downstream
subscribers will have the same problem.

Not always. I take it as a valid case that one may want to create a
logical setup only for the sake of an upgrade, and trash the
publisher after a failover to an upgraded subscriber node, once the
latter has synced up the data added to the relations tracked by the
publications while the subscriber was being pg_upgrade'd.

This is the primary reason why I prioritized to work on the publisher
side before getting this patch done, otherwise, the solution for this
patch was relatively clear. I am not sure but I guess this could be
the reason why originally we left it in the current state, otherwise,
restoring subscription rel's or origin doesn't seem to be too much of
an additional effort than what we are doing now.

By "additional effort", you are referring to what the patch is doing,
with the binary dump of pg_subscription_rel, right?

Yes.

Okay. I'd like to move on with this stuff, then. At least it helps
in maintaining data integrity when doing an upgrade with a logical
setup. The patch still needs more polishing, though..
--
Michael

#88vignesh C
vignesh21@gmail.com
In reply to: vignesh C (#85)
Re: pg_upgrade and logical replication

On Tue, 26 Sept 2023 at 10:58, vignesh C <vignesh21@gmail.com> wrote:

On Wed, 20 Sept 2023 at 16:54, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Sep 15, 2023 at 3:08 PM vignesh C <vignesh21@gmail.com> wrote:

The attached v8 version patch has the changes for the same.

Is the check to ensure remote_lsn is valid correct in function
check_for_subscription_state()? How about the case where the apply
worker didn't receive any change but just marked the relation as
'ready'?

I agree that remote_lsn will not be valid in the case when all the
tables are in ready state and there are no changes to be sent by the
walsender to the worker. I was not sure if this check is required in
this case in the check_for_subscription_state function. I was thinking
that this check could be removed.
I'm also checking why the tables should only be in ready state (the
check that is there in the same function), and whether we can support
upgrades when the tables are in syncdone state. I will post my
analysis once I have finished checking on the same.

Once a table is in the SUBREL_STATE_SYNCDONE state, the apply worker
will check whether it has some WAL records that still need to be
applied to reach the LSN of the table. Once the required WAL is
applied, the table state will be changed from SUBREL_STATE_SYNCDONE to
SUBREL_STATE_READY. Since there is a chance that in this case
the apply worker has to apply some transactions to get all the tables
into the READY state, I felt the minimum requirement should be that
all the tables are in the READY state for the upgrade of the
subscriber.
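
For reference, the per-relation sync state can be inspected on the
subscriber before attempting an upgrade. A rough sketch (it only assumes
the standard catalogs; the state codes are those documented for
pg_subscription_rel):

```sql
-- List relations that have not yet reached the 'r' (READY) state.
-- srsubstate codes: i = initialize, d = data is being copied,
-- f = finished table copy (FINISHEDCOPY), s = synchronized (SYNCDONE),
-- r = ready (READY).
SELECT s.subname, n.nspname, c.relname, sr.srsubstate
FROM pg_subscription_rel sr
JOIN pg_subscription s ON s.oid = sr.srsubid
JOIN pg_class c ON c.oid = sr.srrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE sr.srsubstate <> 'r';
```

An empty result would mean all tracked relations are READY.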

Regards,
Vignesh

#89Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#88)
Re: pg_upgrade and logical replication

On Wed, Sep 27, 2023 at 3:37 PM vignesh C <vignesh21@gmail.com> wrote:

On Tue, 26 Sept 2023 at 10:58, vignesh C <vignesh21@gmail.com> wrote:

On Wed, 20 Sept 2023 at 16:54, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Sep 15, 2023 at 3:08 PM vignesh C <vignesh21@gmail.com> wrote:

The attached v8 version patch has the changes for the same.

Is the check to ensure remote_lsn is valid correct in function
check_for_subscription_state()? How about the case where the apply
worker didn't receive any change but just marked the relation as
'ready'?

I agree that remote_lsn will not be valid in the case when all the
tables are in ready state and there are no changes to be sent by the
walsender to the worker. I was not sure if this check is required in
this case in the check_for_subscription_state function. I was thinking
that this check could be removed.
I'm also checking why the tables should only be in ready state (the
check that is there in the same function), and whether we can support
upgrades when the tables are in syncdone state. I will post my
analysis once I have finished checking on the same.

Once a table is in the SUBREL_STATE_SYNCDONE state, the apply worker
will check whether it has some WAL records that still need to be
applied to reach the LSN of the table. Once the required WAL is
applied, the table state will be changed from SUBREL_STATE_SYNCDONE to
SUBREL_STATE_READY. Since there is a chance that in this case
the apply worker has to apply some transactions to get all the tables
into the READY state, I felt the minimum requirement should be that
all the tables are in the READY state for the upgrade of the
subscriber.

I don't think this theory is completely correct because the pending
WAL can be applied even after an upgrade.

--
With Regards,
Amit Kapila.

#90vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#80)
Re: pg_upgrade and logical replication

On Mon, 25 Sept 2023 at 10:05, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Sep 22, 2023 at 4:36 AM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Sep 21, 2023 at 02:35:55PM +0530, Amit Kapila wrote:

It is because after upgrade of both publisher and subscriber, the
subscriptions won't work. Both publisher and subscriber should work,
otherwise, the logical replication set up won't work. I think we can
probably do this, if we can document clearly how the user can make
their logical replication set up work after upgrade.

Yeah, well, this comes back to my original point that the upgrade of
publisher nodes and subscriber nodes should be treated as two
different problems or we're mixing apples and oranges (and a node
could have both subscriber and publishers). While being able to
support both is a must, it is going to be a two-step process at the
end, with the subscribers done first and the publishers done after.
That's also kind of the point that Julien makes in top message of this
thread.

I agree that docs are lacking in the proposed patch in terms of
restrictions, assumptions and process flow, but taken in isolation the
problem of the publishers is not something that this patch has to take
care of.

I also don't think that this patch has to solve the problem of
publishers in any way but as per my understanding, if due to some
reason we are not able to do the upgrade of publishers, this can add
more steps for users than they have to do now for logical replication
set up after upgrade. This is because now after restoring the
subscription rel's and origin, as soon as we start replication after
creating the slots on the publisher, we will never be able to
guarantee data consistency. So, they need to drop the entire
subscription setup including truncating the relations, and then set it
up from scratch which also means they need to somehow remember or take
a dump of the current subscription setup. According to me, the key
point is to have a mechanism to set up slots correctly to allow
replication (or subscriptions) to work after the upgrade. Without
that, it appears to me that we are restoring a subscription where it
can start from some random LSN and can easily lead to data consistency
issues where it can miss some of the updates.

This is the primary reason why I prioritized to work on the publisher
side before getting this patch done, otherwise, the solution for this
patch was relatively clear. I am not sure but I guess this could be
the reason why originally we left it in the current state, otherwise,
restoring subscription rel's or origin doesn't seem to be too much of
an additional effort than what we are doing now.

I have tried to analyze the steps for upgrading the subscriber with
HEAD and with the upgrade patches; here are the steps for each.
Current steps to upgrade the subscriber in HEAD:
1) Upgrade the subscriber server
2) Start subscriber server
3) truncate the tables
4) Alter the subscriptions to point to new slots in the subscriber
5) Enable the subscriptions
6) Alter subscription to refresh the publications
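
As rough psql commands (the subscription, slot, and table names here are
made up for illustration), the steps above would look something like:

```sql
-- 3) truncate each replicated table so the initial sync starts clean
TRUNCATE TABLE public.tab1;
-- 4) point the subscription at a freshly created slot on the publisher
ALTER SUBSCRIPTION sub1 SET (slot_name = 'sub1_new_slot');
-- 5) enable the subscription
ALTER SUBSCRIPTION sub1 ENABLE;
-- 6) refresh so the relation list (and initial sync) is picked up again
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION;
```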

Steps to upgrade if we commit only the subscriber upgrade patch:
1) Upgrade the subscriber server
2) Start subscriber server
3) truncate the tables
Note: We will have to drop the subscriptions as we have made changes
to the pg_subscription_rel
4) But drop subscription will throw an error:
postgres=# DROP SUBSCRIPTION test1 cascade;
ERROR: could not drop replication slot "test1" on publisher: ERROR:
replication slot "test1" does not exist
5) Alter the subscription to set slot_name to none
6) Make a note of all the subscriptions that are present
7) drop the subscriptions
8) Create the subscriptions

The number of steps will increase in this case.
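
A sketch of steps 5-8 in SQL (names are hypothetical; slot_name = NONE
is needed because the remote slot no longer exists, so DROP SUBSCRIPTION
must not try to drop it):

```sql
-- 5) dissociate the missing replication slot (the subscription
--    must be disabled first)
ALTER SUBSCRIPTION test1 DISABLE;
ALTER SUBSCRIPTION test1 SET (slot_name = NONE);
-- 7) drop the subscription without touching the publisher
DROP SUBSCRIPTION test1;
-- 8) recreate it; the default copy_data = true re-syncs the
--    previously truncated tables
CREATE SUBSCRIPTION test1
  CONNECTION 'host=publisher dbname=postgres'
  PUBLICATION pub1;
```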

Steps to upgrade if we commit the publisher upgrade patch first and
then the subscriber upgrade patch:
1) Upgrade the subscriber server
2) Start subscriber server
3) Enable the subscription
4) Alter subscription to refresh the publications

Based on the above, I also feel it is better to get the upgrade
publisher patch committed first, as a) it will reduce the data copying
time (as truncate is not required), b) the number of steps will be
reduced, and c) all the use cases will be handled.

Regards,
Vignesh

#91Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#89)
Re: pg_upgrade and logical replication

On Wed, Sep 27, 2023 at 07:31:41PM +0530, Amit Kapila wrote:

On Wed, Sep 27, 2023 at 3:37 PM vignesh C <vignesh21@gmail.com> wrote:

Once a table is in the SUBREL_STATE_SYNCDONE state, the apply worker
will check whether it has some WAL records that still need to be
applied to reach the LSN of the table. Once the required WAL is
applied, the table state will be changed from SUBREL_STATE_SYNCDONE to
SUBREL_STATE_READY. Since there is a chance that in this case
the apply worker has to apply some transactions to get all the tables
into the READY state, I felt the minimum requirement should be that
all the tables are in the READY state for the upgrade of the
subscriber.

I don't think this theory is completely correct because the pending
WAL can be applied even after an upgrade.

Yeah, agreed that putting a pre-check about the state of the relations
stored in pg_subscription_rel when handling the upgrade of a
subscriber is not necessary.
--
Michael

#92Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#87)
Re: pg_upgrade and logical replication

On Wed, Sep 27, 2023 at 9:14 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Sep 26, 2023 at 09:40:48AM +0530, Amit Kapila wrote:

On Mon, Sep 25, 2023 at 11:43 AM Michael Paquier <michael@paquier.xyz> wrote:

Sure, that's assuming that the publisher side is upgraded.

At some point, the user needs to upgrade the publisher, and the
subscriber could itself have some publications defined, which means
the downstream subscribers will have the same problem.

Not always. I take it as a valid case that one may want to create a
logical setup only for the sake of an upgrade, and trash the
publisher after a failover to an upgraded subscriber node, once the
latter has synced up the data added to the relations tracked by the
publications while the subscriber was being pg_upgrade'd.

Such a use case is possible to achieve even without this patch.
Sawada-San has already given an alternative to slightly tweak the
steps mentioned by Julien to achieve it. Also, there are other ways to
achieve it by slightly changing the steps. OTOH, it will create a
problem for a normal logical replication setup after upgrade, as
discussed.

--
With Regards,
Amit Kapila.

#93Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#92)
Re: pg_upgrade and logical replication

On Fri, Sep 29, 2023 at 05:32:52PM +0530, Amit Kapila wrote:

Such a use case is possible to achieve even without this patch.
Sawada-San has already given an alternative to slightly tweak the
steps mentioned by Julien to achieve it. Also, there are other ways to
achieve it by slightly changing the steps. OTOH, it will create a
problem for a normal logical replication setup after upgrade, as
discussed.

So, now that 29d0a77fa6 has been applied to the tree, would it be time
to brush up what's been discussed on this thread for subscribers? I'm
OK to spend time on it.
--
Michael

#94vignesh C
vignesh21@gmail.com
In reply to: Michael Paquier (#74)
2 attachment(s)
Re: pg_upgrade and logical replication

On Thu, 21 Sept 2023 at 11:27, Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Sep 15, 2023 at 03:08:21PM +0530, vignesh C wrote:

On Tue, 12 Sept 2023 at 14:25, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Is there a possibility that the apply worker on the old cluster connects to the
publisher during the upgrade? Regarding pg_upgrade on the publisher, we
refuse TCP/IP connections from remotes and the port number is also changed, so
we can assume that the subscriber does not connect. But IIUC such settings may
not affect the connection source, so the apply worker may try to connect to the
publisher. Also, are there any hazards if it happens?

Yes, there is a possibility that the apply worker gets started and new
transaction data is being synced from the publisher. I have made a fix
not to start the launcher process in binary upgrade mode, as we don't
want the launcher to start apply workers during the upgrade.

Hmm. I was wondering if 0001 is the right way to handle this case,
but at the end I'm OK to paint one extra isBinaryUpgrade in the code
path where apply launchers are registered. I don't think that the
patch is complete, though. A comment should be added in pg_upgrade's
server.c, exactly start_postmaster(), to tell that -b also stops apply
workers. I am attaching a version updated as of the attached, that
I'd be OK to apply.

I have added comments

I don't really think that we need to worry about a subscriber
connecting back to a publisher in this case, though? I mean, each
postmaster instance started by pg_upgrade restricts the access to the
instance with unix_socket_directories set to a custom path and
permissions at 0700, and a subscription's connection string does not
know the unix path used by pg_upgrade. I certainly agree that
stopping these processes could lead to inconsistencies in the data the
subscribers have been holding though, if we are not careful, so
preventing them from running is a good practice anyway.

I have made the fix similar to how upgrade publisher has done to keep
it consistent.

I have also reviewed 0002. As a whole, I think that I'm OK with the
main approach of the patch in pg_dump to use a new type of dumpable
object for subscription relations that are dumped with their upgrade
functions after. This still needs more work, and more documentation.

Added documentation

Also, perhaps we should really have an option to control if this part
of the copy happens or not. With a --no-subscription-relations for
pg_dump at least?

Currently this is done by default in binary upgrade mode; I will add a
separate patch a little later to allow skipping the dump of
subscription relations in pg_upgrade and pg_dump.

+{ oid => '4551', descr => 'add a relation with the specified relation state to pg_subscription_rel table',

During a development cycle, any new function added needs to use an OID
in range 8000-9999. Running unused_oids will suggest new random OIDs.

Modified

FWIW, I am not convinced that there is a need for two functions to add
an entry to pg_subscription_rel, with sole difference between both the
handling of a valid or invalid LSN. We should have only one function
that's able to handle NULL for the LSN. So let's remove rel_state_a
and rel_state_b, and have a single rel_state(). The description of
the SQL functions is inconsistent with the other binary upgrade ones,
I would suggest for the two functions
"for use by pg_upgrade (relation for pg_subscription_rel)"
"for use by pg_upgrade (remote_lsn for origin)"

Removed rel_state_a and rel_state_b and updated the description accordingly

+   i_srsublsn = PQfnumber(res, "srsublsn");
[...]
+       subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));

In getSubscriptionTables(), this should check for PQgetisnull()
because we would have a NULL value for InvalidXLogRecPtr in the
catalog. Using a char* for srsublsn is OK, but just assign NULL to
it, then just pass a hardcoded NULL value to the function as we do in
other places. So I don't quite get why this is not the same handling
as suboriginremotelsn.

Modified

getSubscriptionTables() is entirely skipped if we don't want any
subscriptions, if we deal with a server of 9.6 or older or if we don't
do binary upgrades, which is OK.

+/*
+ * getSubscriptionTables
+ *       get information about subscription membership for dumpable tables.
+ */
This comment is slightly misleading and should mention that this is an
upgrade-only path?

Modified

The code for dumpSubscriptionTable() is a copy-paste of
dumpPublicationTable(), but a lot of what you are doing here is
actually pointless if we are not in binary mode? Why should this code
path not taken only under dataOnly? I mean, this is a code path we
should never take except if we are in binary mode. This should have
at least a cross-check to make sure that we never have a
DO_SUBSCRIPTION_REL in this code path if we are in non-binary mode.

I have added an assert in this case, as it is not expected to come
here in non-binary mode.

+    if (dopt->binary_upgrade && subinfo->suboriginremotelsn)
+    {
+        appendPQExpBufferStr(query,
+                             "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+        appendStringLiteralAH(query, subinfo->dobj.name, fout);
+        appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+    }

Hmm.. Could it be actually useful even for debugging to still have
this query if suboriginremotelsn is an InvalidXLogRecPtr? I think
that this should have a comment of the kind "\n-- For binary upgrade,
blah". At least it would not be a bad thing to enforce a correct
state from the start, removing the NULL check for the second argument
in binary_upgrade_replorigin_advance().

Modified

+ /* We need to check for pg_replication_origin_status only once. */
Perhaps it would be better to explain why?

This remote_lsn code change is actually not required, I have removed this now.

+ "WHERE coalesce(remote_lsn, '0/0') = '0/0'"
Why a COALESCE here? Cannot this stuff just use NULL?

This remote_lsn code change is actually not required, I have removed this now.

+ fprintf(script, "database:%s subscription:%s relation:%s in non-ready state\n",
Could it be possible to include the schema of the relation in this log?

Modified

+static void check_for_subscription_state(ClusterInfo *cluster);
I'd be tempted to move that into a patch on its own, actually, for a
cleaner history.

As of now I have kept it together, I will change it later based on
more feedback from others

+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
New as of 2023.

Modified

+# Check that after upgradation of the subscriber server, the incremental
+# changes added to the publisher are replicated.
[..]
+   For upgradation of the subscriptions, all the subscriptions on the old
+   cluster must have a valid <varname>remote_lsn</varname>, and all the

Upgradation? I think that this should be reworded:
"All the subscriptions of an old cluster require a valid remote_lsn
during an upgrade."

This remote_lsn code change is actually not required, I have removed this now.

A CI run is reporting the following compilation warnings:
[04:21:15.290] pg_dump.c: In function ‘getSubscriptionTables’:
[04:21:15.290] pg_dump.c:4655:29: error: ‘subinfo’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
[04:21:15.290] 4655 | subrinfo[cur_rel].subinfo = subinfo;

I have initialized and checked with [-Werror=maybe-uninitialized],
let me check in the next cfbot run

+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+       "pg_upgrade_output.d/ not removed after pg_upgrade failure");
Not sure that there's a need for this check.  Okay, that's cheap.

Modified

And, err. We are going to need an option to control if the slot data
is copied, and a bit more documentation in pg_upgrade to explain how
things happen when the copy happens.

Added documentation for this; we will copy the slot data by default. I
will add a separate patch a little later to allow skipping the dump of
subscription relations / replication slots in pg_upgrade and pg_dump.

The attached v9 version patch has the changes for the same.

Apart from this, I'm still checking whether the old cluster's
subscription relation states must all be READY, or whether SYNCDONE or
FINISHEDCOPY could work as well; this needs more thought before
concluding which is the correct state to check. Let's handle this in
the upcoming version.

Regards,
Vignesh

Attachments:

v9-0001-Prevent-startup-of-logical-replication-launcher-i.patch (text/x-patch)
From 40cff73c7bd5d78eb05609986e25b1718fdbea0c Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Fri, 27 Oct 2023 11:18:28 +0530
Subject: [PATCH v9 1/2] Prevent startup of logical replication launcher in
 binary upgrade mode

The logical replication launcher may start apply workers during an
upgrade, which could be the cause of corruptions on a new cluster if
these are able to apply changes before the physical files are copied
over.

The chance of being able to do so should be small as pg_upgrade uses its
own port and unix domain directory (customizable as well with
--socketdir), but just preventing the launcher from starting is safer,
because we are then sure that no changes would ever be applied.

Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm2g9ZKf=y8X6z6MsLCuh8WwU-=Q6pLj35NFi2M5BZNS_A@mail.gmail.com
---
 src/bin/pg_upgrade/server.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index d7f6c268ef..9dedf63a87 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -248,9 +248,14 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
 	 * invalidation of slots during the upgrade. We set this option when
 	 * cluster is PG17 or later because logical replication slots can only be
 	 * migrated since then. Besides, max_slot_wal_keep_size is added in PG13.
+	 * We don't want the launcher to run while upgrading because it may start
+	 * apply workers which could start receiving changes from the publisher
+	 * before the physical files are put in place, causing corruption on the
+	 * new cluster being upgraded to, so we set max_logical_replication_workers
+	 * to 0 to disable the launcher.
 	 */
 	if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
-		appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+		appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1 -c max_logical_replication_workers=0");
 
 	/* Use -b to disable autovacuum. */
 	snprintf(cmd, sizeof(cmd),
-- 
2.34.1

v9-0002-Preserve-the-full-subscription-s-state-during-pg_.patch (text/x-patch)
From d4313513a3af89461f3b00e81761ab001fb8764a Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Fri, 27 Oct 2023 10:58:04 +0530
Subject: [PATCH v9 2/2] Preserve the full subscription's state during
 pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_create_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_create_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_create_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, the state is the state of the relation, sublsn is subscription lsn
which is optional, and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values(subname, relid, state and sublsn) from the
old cluster.

The subscription's replication origin is needed to ensure
that we don't replicate anything twice.

To fix this problem, this patch teaches pg_dump to update the replication
origin along with create subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is the subscription
lsn. pg_dump will retrieve these values (subname and sublsn) from the old cluster.

pg_upgrade will check that all the subscriptions have a valid replication origin
remote_lsn, and that all underlying relations are in the 'r' (ready) state, and
will error out if that's not the case, logging the reason for the failure.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  45 +++++
 src/backend/catalog/pg_subscription.c      |   2 +
 src/backend/utils/adt/pg_upgrade_support.c | 126 +++++++++++++
 src/bin/pg_dump/common.c                   |  22 +++
 src/bin/pg_dump/pg_dump.c                  | 198 +++++++++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  16 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 |  76 ++++++++
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 200 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 ++
 src/tools/pgindent/typedefs.list           |   1 +
 12 files changed, 704 insertions(+), 4 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 46e8a0b746..280621389d 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,45 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Verify that all the subscription tables in the old subscriber are in the
+     <literal>r</literal> (ready) state.  Set up the
+     <link linkend="logical-replication-config-subscriber">subscriber
+     configuration</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies, which include the subscription table information present in
+     the <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and the subscription replication origin.  This allows
+     logical replication to continue from where the old subscriber was
+     replicating, and avoids having to set up the subscription objects
+     manually, which would require truncating all the subscription tables and
+     resetting the logical replication slots.  Migration of subscriber
+     dependencies is only supported when the old cluster is version 17.0 or
+     later; subscriber dependencies on clusters before version 17.0 will be
+     silently ignored.
+    </para>
+
+    <para>
+     There is a prerequisite that all the subscription tables be in the
+     <literal>r</literal> (ready) state for
+     <application>pg_upgrade</application> to be able to upgrade the
+     subscriber.  If this condition is not met, an error will be reported.
+    </para>
+
+    <para>
+     Enable the subscriptions by executing
+     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+    </para>
+    <para>
+     Create all the new tables that were created in the publication, and
+     refresh the subscription by executing
+     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+    </para>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
@@ -928,6 +967,12 @@ psql --username=postgres --file=script.sql postgres
    (<type>regclass</type>, <type>regrole</type>, and <type>regtype</type> can be upgraded.)
   </para>
 
+  <para>
+   For the upgrade of subscriptions, all the subscription tables must be in
+   the <literal>r</literal> (ready) state, or else the
+   <application>pg_upgrade</application> run will error.
+  </para>
+
   <para>
    If you want to use link mode and you do not want your old cluster
    to be modified when the new cluster is started, consider using the clone mode.
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d6a978f136..492b34ff12 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -25,6 +25,8 @@
 #include "catalog/pg_type.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..e8b12adb3c 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,21 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +311,123 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_create_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * table.
+ */
+Datum
+binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_create_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+
+	if (PG_ARGISNULL(3))
+		sublsn = InvalidXLogRecPtr;
+	else
+		sublsn = PG_GETARG_LSN(3);
+
+	if (!OidIsValid(relid))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("invalid relation identifier used: %u", relid));
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	sublsn;
+	char		originname[NAMEDATALEN];
+	RepOriginId originid;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+	if (PG_ARGISNULL(1))
+		sublsn = InvalidXLogRecPtr;
+	else
+		sublsn = PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+	originid = replorigin_by_name(originname, false);
+	replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7afdbf4d9d..900ddef064 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4581,6 +4582,99 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  get information about subscription membership for dumpable tables; this
+ *    is used only in binary-upgrade mode.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i;
+	int			cur_rel = 0;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get subscription relation fields */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[cur_rel].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[cur_rel].dobj.catId.tableoid = relid;
+		subrinfo[cur_rel].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[cur_rel].dobj);
+		subrinfo[cur_rel].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[cur_rel].tblinfo = tblinfo;
+		subrinfo[cur_rel].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[cur_rel].srsublsn = NULL;
+		else
+			subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[cur_rel].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[cur_rel].dobj), fout);
+
+		cur_rel++;
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4606,6 +4700,7 @@ getSubscriptions(Archive *fout)
 	int			i_subpublications;
 	int			i_subbinary;
 	int			i_subpasswordrequired;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4660,15 +4755,19 @@ getSubscriptions(Archive *fout)
 	if (fout->remoteVersion >= 160000)
 		appendPQExpBufferStr(query,
 							 " s.suborigin,\n"
-							 " s.subpasswordrequired\n");
+							 " s.subpasswordrequired,\n");
 	else
 		appendPQExpBuffer(query,
 						  " '%s' AS suborigin,\n"
-						  " 't' AS subpasswordrequired\n",
+						  " 't' AS subpasswordrequired,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	appendPQExpBufferStr(query, "o.remote_lsn\n");
+
 	appendPQExpBufferStr(query,
 						 "FROM pg_subscription s\n"
+						 "LEFT JOIN pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4694,6 +4793,7 @@ getSubscriptions(Archive *fout)
 	i_subdisableonerr = PQfnumber(res, "subdisableonerr");
 	i_suborigin = PQfnumber(res, "suborigin");
 	i_subpasswordrequired = PQfnumber(res, "subpasswordrequired");
+	i_suboriginremotelsn = PQfnumber(res, "remote_lsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4726,6 +4826,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
 		subinfo[i].subpasswordrequired =
 			pg_strdup(PQgetvalue(res, i, i_subpasswordrequired));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4735,6 +4840,80 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  dump the definition of the given subscription table mapping; this is
+ *    used only during binary upgrade.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_create_sub_rel_state will add the subscription
+		 * relation to the pg_subscription_rel table; this is supported only
+		 * for upgrade operations.
+		 */
+		if (fout->remoteVersion >= 170000)
+		{
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_create_sub_rel_state(");
+			appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+			appendPQExpBuffer(query,
+							  ", %u, '%c'",
+							  subrinfo->tblinfo->dobj.catId.oid,
+							  subrinfo->srsubstate);
+
+			if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", '%s'",
+								  subrinfo->srsublsn);
+			else
+				appendPQExpBuffer(query, ", NULL");
+
+			appendPQExpBufferStr(query, ");\n");
+		}
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4812,6 +4991,17 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000 &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10430,6 +10620,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18496,6 +18689,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index d8f27f187c..efc942283c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -670,8 +671,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *subpasswordrequired;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -696,6 +710,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -755,5 +770,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..4a4b91224d 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d)",
+					 obj->dumpId);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 179f85ae8a..e5a3112dce 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -20,6 +20,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_state(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -112,6 +113,8 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
+	check_for_subscription_state(&old_cluster);
+
 	/*
 	 * Logical replication slots can be migrated since PG17. See comments atop
 	 * get_old_cluster_logical_slot_infos().
@@ -812,6 +815,79 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_state()
+ *
+ * Verify that each subscription has all its corresponding tables in the
+ * ready state.
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	/* Subscription relations state can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subscription_state.txt");
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, c.relname, n.nspname "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE srsubstate != 'r' "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+							output_path, strerror(errno));
+
+			fprintf(script, "database:%s subscription:%s schema:%s relation:%s in non-ready state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 1));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscription(s) with\n"
+				 "invalid remote_lsn or subscription relation(s) not in ready state.\n"
+				 "A list of subscriptions having an invalid remote_lsn and/or\n"
+				 "subscription relations not in ready state is in the file: %s",
+				 output_path);
+	}
+	else
+		check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 2c4f38d865..9bd6e5cbe1 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_upgrade_logical_replication_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..b495de96e3
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,200 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use Cwd qw(abs_path);
+use File::Basename qw(dirname);
+use File::Compare;
+use File::Find qw(find);
+use File::Path qw(rmtree);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::AdjustUpgrade;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $bindir = $new_sub->config_data('--bindir');
+
+sub insert_line
+{
+	my $payload = shift;
+
+	foreach ("t1", "t2")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("t1", "t2")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line('before initial sync');
+
+# Setup logical replication, replicating only 1 table
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR TABLE t1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub CONNECTION '$connstr' PUBLICATION pub");
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'sub');
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready state.
+# ------------------------------------------------------
+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r');";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with all tables in ready state');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Check the number of rows for each table on each server
+my $result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on publisher");
+$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(1), "check initial t2 table data on publisher");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on the old subscriber");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0), "check initial t2 table data on the old subscriber");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if there's a subscription with tables in
+# a state different than 'r' (ready).
+# ------------------------------------------------------
+
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub DISABLE");
+
+# Set tables to 'i' state
+$old_sub->safe_psql(
+	'postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'i' WHERE srsubstate = 'r'");
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with incorrect sub rel');
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# ------------------------------------------------------
+# Check that pg_upgrade doesn't detect any problem once all the subscription's
+# relations are in the 'r' (ready) state.
+# ------------------------------------------------------
+
+$old_sub->safe_psql(
+	'postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'r' WHERE srsubstate = 'i'");
+
+# ------------------------------------------------------
+# The incremental changes added to the publisher are replicated after upgrade.
+# ------------------------------------------------------
+
+# Stop the old subscriber, insert a row in each table while it's down and add
+# t2 to the publication
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+insert_line('while old_sub is down');
+
+# Run pg_upgrade
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+	],
+	'run of pg_upgrade for new sub');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after pg_upgrade success");
+$publisher->safe_psql('postgres', "ALTER PUBLICATION pub ADD TABLE t2");
+
+$new_sub->start;
+
+# Subscription relations and replication origin remote_lsn should be preserved
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(1), "There should be 1 row in pg_subscription_rel");
+
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# There should be no new replicated rows before enabling the subscription
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1),
+	"t1 table has no new replicated rows before enabling the subscription");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in t2 table which is not part of the publication");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
+
+$publisher->wait_for_catchup('sub');
+
+# Rows on t1 should have been replicated, while nothing should happen for t2
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2), "check replicated inserts on new subscriber");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in table t2, which is not part of the publication, after enabling the subscription"
+);
+
+# Refresh the subscription, only the missing row on t2 should be replicated
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'sub');
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2),
+	"check there is no change when there were no changes replicated");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index bc41e92677..380ff107d3 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11375,6 +11375,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_create_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_create_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 87c1aee379..90b321945c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2656,6 +2656,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#95Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#94)
Re: pg_upgrade and logical replication

On Fri, Oct 27, 2023 at 12:09 PM vignesh C <vignesh21@gmail.com> wrote:

Apart from this I'm still checking that the old cluster's subscription
relations states are READY state still, but there is a possibility
that SYNCDONE or FINISHEDCOPY could work, this needs more thought
before concluding which is the correct state to check. Let's handle
this in the upcoming version.

I was analyzing this part and it seems it could be tricky to upgrade
in FINISHEDCOPY state, because the system would expect the subscriber
to know the old slot name from the old cluster, which it can drop at
SYNCDONE state. Now, as sync_slot_name is generated based on subid and
relid, which could be different in the new cluster, the generated
slot name would be different after the upgrade. OTOH, if the relstate
is INIT, then I think the sync could be performed even after the
upgrade.
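
The slot-name dependency can be illustrated with a rough catalog query (a sketch only, modeled on how the tablesync slot name is built from the subscription OID, relation OID, and system identifier; the exact name format is an assumption here, not taken from the patch):

```sql
-- Sketch, not authoritative: reconstruct the tablesync slot names the old
-- cluster would have used for relations still in FINISHEDCOPY ('f') state.
-- Since the subscription OID (and possibly the relation OID and system
-- identifier) differ in the new cluster, these names could not be
-- regenerated there to drop the old slots.
SELECT 'pg_' || s.oid || '_sync_' || sr.srrelid || '_' ||
       (SELECT system_identifier FROM pg_control_system()) AS sync_slot_name
FROM pg_subscription s
JOIN pg_subscription_rel sr ON sr.srsubid = s.oid
WHERE sr.srsubstate = 'f';
```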

Shouldn't we at least ensure that replication origins do exist in the
old cluster corresponding to each of the subscriptions? Otherwise,
later the query to get remote_lsn for origin in getSubscriptions()
would fail.
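
Such a pre-check could look something like the following sketch; it mirrors the `LEFT JOIN` on `pg_replication_origin_status` used later in the patch's `getSubscriptions()` query, where the origin's `external_id` is `'pg_' || subscription OID`:

```sql
-- Sketch: list subscriptions in the old cluster that lack a replication
-- origin row; the upgrade query would later find a NULL remote_lsn here.
SELECT s.subname
FROM pg_subscription s
LEFT JOIN pg_replication_origin_status o
       ON o.external_id = 'pg_' || s.oid::text
WHERE o.external_id IS NULL;
```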

--
With Regards,
Amit Kapila.

#96vignesh C
vignesh21@gmail.com
In reply to: vignesh C (#94)
2 attachment(s)
Re: pg_upgrade and logical replication

On Fri, 27 Oct 2023 at 12:09, vignesh C <vignesh21@gmail.com> wrote:

On Thu, 21 Sept 2023 at 11:27, Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Sep 15, 2023 at 03:08:21PM +0530, vignesh C wrote:

On Tue, 12 Sept 2023 at 14:25, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Is there a possibility that apply worker on old cluster connects to the
publisher during the upgrade? Regarding the pg_upgrade on publisher, the we
refuse TCP/IP connections from remotes and port number is also changed, so we can
assume that subscriber does not connect to. But IIUC such settings may not affect
to the connection source, so that the apply worker may try to connect to the
publisher. Also, is there any hazards if it happens?

Yes, there is a possibility that the apply worker gets started and new
transaction data is being synced from the publisher. I have made a fix
not to start the launcher process in binary upgrade mode as we don't
want the launcher to start apply workers during the upgrade.

Hmm. I was wondering if 0001 is the right way to handle this case,
but at the end I'm OK to paint one extra isBinaryUpgrade in the code
path where apply launchers are registered. I don't think that the
patch is complete, though. A comment should be added in pg_upgrade's
server.c, exactly start_postmaster(), to tell that -b also stops apply
workers. I am attaching a version updated as of the attached, that
I'd be OK to apply.

I have added comments

I don't really think that we need to worry about a subscriber
connecting back to a publisher in this case, though? I mean, each
postmaster instance started by pg_upgrade restricts the access to the
instance with unix_socket_directories set to a custom path and
permissions at 0700, and a subscription's connection string does not
know the unix path used by pg_upgrade. I certainly agree that
stopping these processes could lead to inconsistencies in the data the
subscribers have been holding though, if we are not careful, so
preventing them from running is a good practice anyway.

I have made the fix similar to how upgrade publisher has done to keep
it consistent.

I have also reviewed 0002. As a whole, I think that I'm OK with the
main approach of the patch in pg_dump to use a new type of dumpable
object for subscription relations that are dumped with their upgrade
functions after. This still needs more work, and more documentation.

Added documentation

Also, perhaps we should really have an option to control if this part
of the copy happens or not. With a --no-subscription-relations for
pg_dump at least?

Currently this is done by default in binary upgrade mode; I will add a
separate patch to skip the dump of subscription relations a little
later.

+{ oid => '4551', descr => 'add a relation with the specified relation state to pg_subscription_rel table',

During a development cycle, any new function added needs to use an OID
in range 8000-9999. Running unused_oids will suggest new random OIDs.

Modified

FWIW, I am not convinced that there is a need for two functions to add
an entry to pg_subscription_rel, with sole difference between both the
handling of a valid or invalid LSN. We should have only one function
that's able to handle NULL for the LSN. So let's remove rel_state_a
and rel_state_b, and have a single rel_state(). The description of
the SQL functions is inconsistent with the other binary upgrade ones,
I would suggest for the two functions
"for use by pg_upgrade (relation for pg_subscription_rel)"
"for use by pg_upgrade (remote_lsn for origin)"

Removed rel_state_a and rel_state_b and updated the description accordingly

+   i_srsublsn = PQfnumber(res, "srsublsn");
[...]
+       subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));

In getSubscriptionTables(), this should check for PQgetisnull()
because we would have a NULL value for InvalidXLogRecPtr in the
catalog. Using a char* for srsublsn is OK, but just assign NULL to
it, then just pass a hardcoded NULL value to the function as we do in
other places. So I don't quite get why this is not the same handling
as suboriginremotelsn.

Modified

getSubscriptionTables() is entirely skipped if we don't want any
subscriptions, if we deal with a server of 9.6 or older or if we don't
do binary upgrades, which is OK.

+/*
+ * getSubscriptionTables
+ *       get information about subscription membership for dumpable tables.
+ */
This comment is slightly misleading and should mention that this is an
upgrade-only path?

Modified

The code for dumpSubscriptionTable() is a copy-paste of
dumpPublicationTable(), but a lot of what you are doing here is
actually pointless if we are not in binary mode? Why should this code
path not taken only under dataOnly? I mean, this is a code path we
should never take except if we are in binary mode. This should have
at least a cross-check to make sure that we never have a
DO_SUBSCRIPTION_REL in this code path if we are in non-binary mode.

I have added an assert in this case, as it is not expected to come
here in non binary mode

+    if (dopt->binary_upgrade && subinfo->suboriginremotelsn)
+    {
+        appendPQExpBufferStr(query,
+                             "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+        appendStringLiteralAH(query, subinfo->dobj.name, fout);
+        appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+    }

Hmm.. Could it be actually useful even for debugging to still have
this query if suboriginremotelsn is an InvalidXLogRecPtr? I think
that this should have a comment of the kind "\n-- For binary upgrade,
blah". At least it would not be a bad thing to enforce a correct
state from the start, removing the NULL check for the second argument
in binary_upgrade_replorigin_advance().

Modified

+ /* We need to check for pg_replication_origin_status only once. */
Perhaps it would be better to explain why?

This remote_lsn code change is actually not required, I have removed this now.

+ "WHERE coalesce(remote_lsn, '0/0') = '0/0'"
Why a COALESCE here? Cannot this stuff just use NULL?

This remote_lsn code change is actually not required, I have removed this now.

+ fprintf(script, "database:%s subscription:%s relation:%s in non-ready state\n",
Could it be possible to include the schema of the relation in this log?

Modified

+static void check_for_subscription_state(ClusterInfo *cluster);
I'd be tempted to move that into a patch on its own, actually, for a
cleaner history.

As of now I have kept it together, I will change it later based on
more feedback from others

+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
New as of 2023.

Modified

+# Check that after upgradation of the subscriber server, the incremental
+# changes added to the publisher are replicated.
[..]
+   For upgradation of the subscriptions, all the subscriptions on the old
+   cluster must have a valid <varname>remote_lsn</varname>, and all the

Upgradation? I think that this should be reworded:
"All the subscriptions of an old cluster require a valid remote_lsn
during an upgrade."

This remote_lsn code change is actually not required, I have removed this now.

A CI run is reporting the following compilation warnings:
[04:21:15.290] pg_dump.c: In function ‘getSubscriptionTables’:
[04:21:15.290] pg_dump.c:4655:29: error: ‘subinfo’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
[04:21:15.290] 4655 | subrinfo[cur_rel].subinfo = subinfo;

I have initialized and checked with [-Werror=maybe-uninitialized],
let me check in the next cfbot run

+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+       "pg_upgrade_output.d/ not removed after pg_upgrade failure");
Not sure that there's a need for this check.  Okay, that's cheap.

Modified

And, err. We are going to need an option to control if the slot data
is copied, and a bit more documentation in pg_upgrade to explain how
things happen when the copy happens.

Added documentation for this; we will copy the slot data by default,
and will add a separate patch to skip the dump of subscription
relations/replication slots a little later.

The attached v9 version patch has the changes for the same.

Apart from this I'm still checking that the old cluster's subscription
relations states are READY state still, but there is a possibility
that SYNCDONE or FINISHEDCOPY could work, this needs more thought
before concluding which is the correct state to check. Let's handle
this in the upcoming version.

The patch was not applying because of recent commits. Here is a
rebased version of the patches.

Regards,
Vignesh

Attachments:

v9_20231030-0001-Prevent-startup-of-logical-replication-launcher-i.patch (text/x-patch)
From ab682bfcfbad5e70e288647bb7ebf27d7c5abd74 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Fri, 27 Oct 2023 11:18:28 +0530
Subject: [PATCH v10 1/2] Prevent startup of logical replication launcher in
 binary upgrade mode

The logical replication launcher may start apply workers during an
upgrade, which could be the cause of corruptions on a new cluster if
these are able to apply changes before the physical files are copied
over.

The chance of being able to do so should be small as pg_upgrade uses its
own port and unix domain directory (customizable as well with
--socketdir), but just preventing the launcher from starting is safer in
the end, because we are then sure that no changes would ever be applied.

Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm2g9ZKf=y8X6z6MsLCuh8WwU-=Q6pLj35NFi2M5BZNS_A@mail.gmail.com
---
 src/bin/pg_upgrade/server.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index d7f6c268ef..9dedf63a87 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -248,9 +248,14 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
 	 * invalidation of slots during the upgrade. We set this option when
 	 * cluster is PG17 or later because logical replication slots can only be
 	 * migrated since then. Besides, max_slot_wal_keep_size is added in PG13.
+	 * We don't want the launcher to run while upgrading because it may start
+	 * apply workers which could start receiving changes from the publisher
+	 * before the physical files are put in place, causing corruption on the
+	 * new cluster being upgraded to, so we set
+	 * max_logical_replication_workers=0 to disable the launcher.
 	 */
 	if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
-		appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+		appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1 -c max_logical_replication_workers=0");
 
 	/* Use -b to disable autovacuum. */
 	snprintf(cmd, sizeof(cmd),
-- 
2.34.1

v9_20231030-0002-Preserve-the-full-subscription-s-state-during-pg_.patch (text/x-patch)
From 440f0b8eca3ad8e3de537828edd52f031143d029 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v10 2/2] Preserve the full subscription's state during
 pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state, it's impossible to re-enable the
subscriptions without missing some records, as the list of relations can only
be refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling
it, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_create_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_create_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_create_sub_rel_state(subname text, relid oid, state char [, sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, state is the state of the relation, and sublsn is the subscription
LSN, which is optional and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values (subname, relid, state and sublsn) from the
old cluster.

The subscription's replication origin is needed to ensure
that we don't replicate anything twice.

To that end, this patch also teaches pg_dump to update the replication
origin along with creating the subscription, using the
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is the subscription
LSN. pg_dump will retrieve these values (subname and sublsn) from the old
cluster.

pg_upgrade will check that all the subscriptions have a valid replication origin
remote_lsn, and that all underlying relations are in 'r' (ready) state, and
will error out if that's not the case, logging the reason for the failure.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  45 +++++
 src/backend/catalog/pg_subscription.c      |   2 +
 src/backend/utils/adt/pg_upgrade_support.c | 126 +++++++++++++
 src/bin/pg_dump/common.c                   |  22 +++
 src/bin/pg_dump/pg_dump.c                  | 197 +++++++++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  16 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 |  76 ++++++++
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 200 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 ++
 src/tools/pgindent/typedefs.list           |   1 +
 12 files changed, 703 insertions(+), 4 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 46e8a0b746..280621389d 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,45 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Verify that all the subscription tables in the old subscriber are in
+     <literal>r</literal> (ready) state. Set up the
+     <link linkend="logical-replication-config-subscriber"> subscriber
+     configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies, which include the subscription tables information present in
+     the <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system table and the subscription replication origin, which
+     helps in continuing logical replication from where the old subscriber
+     was replicating. This avoids the need to set up the
+     subscription objects manually, which would require truncating all the
+     subscription tables and setting up the logical replication slots. Migration
+     of subscriber dependencies is only supported when the old cluster is
+     version 17.0 or later. Subscriber dependencies on clusters before version
+     17.0 will silently be ignored.
+    </para>
+
+    <para>
+     There is a prerequisite that all the subscription tables should be in
+     <literal>r</literal> (ready) state for
+     <application>pg_upgrade</application> to be able to upgrade the
+     subscriber. If this is not met, an error will be reported.
+    </para>
+
+    <para>
+     Enable the subscriptions by executing
+     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+    </para>
+    <para>
+     Create all the new tables that were created in the publication and
+     refresh the publication by executing
+     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+    </para>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
@@ -928,6 +967,12 @@ psql --username=postgres --file=script.sql postgres
    (<type>regclass</type>, <type>regrole</type>, and <type>regtype</type> can be upgraded.)
   </para>
 
+  <para>
+   When upgrading subscriptions, all the subscription tables should be
+   in <literal>r</literal> (ready) state, or else the
+   <application>pg_upgrade</application> run will error.
+  </para>
+
   <para>
    If you want to use link mode and you do not want your old cluster
    to be modified when the new cluster is started, consider using the clone mode.
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d6a978f136..492b34ff12 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -25,6 +25,8 @@
 #include "catalog/pg_type.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..e8b12adb3c 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,21 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +311,123 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_create_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * table.
+ */
+Datum
+binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_create_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+
+	if (PG_ARGISNULL(3))
+		sublsn = InvalidXLogRecPtr;
+	else
+		sublsn = PG_GETARG_LSN(3);
+
+	if (!OidIsValid(relid))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("invalid relation identifier used: %u", relid));
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	sublsn;
+	char		originname[NAMEDATALEN];
+	RepOriginId originid;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+	if (PG_ARGISNULL(1))
+		sublsn = InvalidXLogRecPtr;
+	else
+		sublsn = PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+	originid = replorigin_by_name(originname, false);
+	replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e863913849..a81d1384a4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4581,6 +4582,99 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  get information about subscription membership for dumpable tables;
+ *	  this is used only in binary-upgrade mode.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i;
+	int			cur_rel = 0;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get subscription relation fields */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[cur_rel].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[cur_rel].dobj.catId.tableoid = relid;
+		subrinfo[cur_rel].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[cur_rel].dobj);
+		subrinfo[cur_rel].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[cur_rel].tblinfo = tblinfo;
+		subrinfo[cur_rel].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[cur_rel].srsublsn = NULL;
+		else
+			subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[cur_rel].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[cur_rel].dobj), fout);
+
+		cur_rel++;
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4607,6 +4701,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4662,16 +4757,19 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	appendPQExpBufferStr(query, "o.remote_lsn\n");
 	appendPQExpBufferStr(query,
 						 "FROM pg_subscription s\n"
+						 "LEFT JOIN pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4698,6 +4796,7 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "remote_lsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4735,6 +4834,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4744,6 +4848,80 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  dump the definition of the given subscription table mapping; this is
+ *	  used only during binary upgrade.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_create_sub_rel_state will add the subscription
+		 * relation to the pg_subscription_rel table; this is supported only
+		 * during binary upgrade.
+		 */
+		if (fout->remoteVersion >= 170000)
+		{
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_create_sub_rel_state(");
+			appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+			appendPQExpBuffer(query,
+							  ", %u, '%c'",
+							  subrinfo->tblinfo->dobj.catId.oid,
+							  subrinfo->srsubstate);
+
+			if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", '%s'",
+								  subrinfo->srsublsn);
+			else
+				appendPQExpBuffer(query, ", NULL");
+
+			appendPQExpBufferStr(query, ");\n");
+		}
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4824,6 +5002,17 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000 &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10442,6 +10631,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18508,6 +18700,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..3012da5b49 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -671,8 +672,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char       *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char            srsubstate;
+	char       *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +711,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +771,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..4a4b91224d 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d)",
+					 obj->dumpId);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..2fe5220fa3 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -20,6 +20,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_state(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -112,6 +113,8 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
+	check_for_subscription_state(&old_cluster);
+
 	/*
 	 * Logical replication slots can be migrated since PG17. See comments atop
 	 * get_old_cluster_logical_slot_infos().
@@ -812,6 +815,79 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_state()
+ *
+ * Verify that each subscription has all of its relations in the
+ * ready state.
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	/* Subscription relation states can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subscription_state.txt");
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, c.relname, n.nspname "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE srsubstate != 'r' "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+							output_path, strerror(errno));
+
+			fprintf(script, "database:%s subscription:%s schema:%s relation:%s in non-ready state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 1));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscription(s) with\n"
+				 "invalid remote_lsn or subscription relation(s) not in ready state.\n"
+				 "A list of subscription having invalid remote_lsn and/or\n"
+				 "subscription relation(s) not in ready state is in the file: %s",
+				 output_path);
+	}
+	else
+		check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..b495de96e3
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,200 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use Cwd qw(abs_path);
+use File::Basename qw(dirname);
+use File::Compare;
+use File::Find qw(find);
+use File::Path qw(rmtree);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::AdjustUpgrade;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $bindir = $new_sub->config_data('--bindir');
+
+sub insert_line
+{
+	my $payload = shift;
+
+	foreach ("t1", "t2")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("t1", "t2")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line('before initial sync');
+
+# Setup logical replication, replicating only 1 table
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR TABLE t1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub CONNECTION '$connstr' PUBLICATION pub");
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'sub');
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready state.
+# ------------------------------------------------------
+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r');";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with all relations in ready state');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Check the number of rows for each table on each server
+my $result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on publisher");
+$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(1), "check initial t1 table data on publisher");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on the old subscriber");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0), "check initial t2 table data on the old subscriber");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if there's a subscription with tables in
+# a state other than 'r' (ready).
+# ------------------------------------------------------
+
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub DISABLE");
+
+# Set tables to 'i' state
+$old_sub->safe_psql(
+	'postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'i' WHERE srsubstate = 'r'");
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with relations in non-ready state');
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# ------------------------------------------------------
+# Check that pg_upgrade doesn't detect any problem once all the subscription's
+# relations are in 'r' (ready) state.
+# ------------------------------------------------------
+
+$old_sub->safe_psql(
+	'postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'r' WHERE srsubstate = 'i'");
+
+# ------------------------------------------------------
+# The incremental changes added to the publisher are replicated after upgrade.
+# ------------------------------------------------------
+
+# Stop the old subscriber, insert a row in each table while it's down and add
+# t2 to the publication
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+insert_line('while old_sub is down');
+
+# Run pg_upgrade
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+	],
+	'run of pg_upgrade for new sub');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after pg_upgrade success");
+$publisher->safe_psql('postgres', "ALTER PUBLICATION pub ADD TABLE t2");
+
+$new_sub->start;
+
+# Subscription relations and replication origin remote_lsn should be preserved
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(1), "There should be 1 row in pg_subscription_rel");
+
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# There should be no new replicated rows before enabling the subscription
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1),
+	"t1 table has no new replicated rows before enabling the subscription");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in t2 table which is not part of the publication");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
+
+$publisher->wait_for_catchup('sub');
+
+# Rows on t1 should have been replicated, while nothing should happen for t2
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2), "check replicated inserts on new subscriber");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in table t2 afer enable subscription which is not part of the publication"
+);
+
+# Refresh the subscription, only the missing row on t2 should be replicated
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'sub');
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2),
+	"check there is no change when there was no changes replicated");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 568aa80d92..40ad5f2fb9 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11379,6 +11379,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_create_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_create_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 87c1aee379..90b321945c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2656,6 +2656,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#97Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#95)
Re: pg_upgrade and logical replication

On Fri, Oct 27, 2023 at 05:05:39PM +0530, Amit Kapila wrote:

I was analyzing this part and it seems it could be tricky to upgrade
in FINISHEDCOPY state. Because the system would expect that subscriber
would know the old slotname from oldcluster which it can drop at
SYNCDONE state. Now, as sync_slot_name is generated based on subid,
relid which could be different in the new cluster, the generated
slotname would be different after the upgrade. OTOH, if the relstate
is INIT, then I think the sync could be performed even after the
upgrade.
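
To make the point about sync slot names concrete, here is a small sketch (Python, with made-up OIDs) of the `pg_%u_sync_%u_<sysid>` naming scheme used by ReplicationSlotNameForTablesync(): since the subscription OID is assigned afresh on the new cluster, the name a FINISHEDCOPY relation would compute after the upgrade no longer matches the slot left behind on the publisher.

```python
def tablesync_slot_name(subid: int, relid: int, system_id: int) -> str:
    # Mirrors the "pg_%u_sync_%u_" UINT64_FORMAT scheme used by the
    # subscriber to name tablesync slots (illustrative sketch only).
    return f"pg_{subid}_sync_{relid}_{system_id}"

# Same table and system identifier, but the subscription OID differs
# between the old and new cluster, so the computed slot names differ
# and the old slot can no longer be found and dropped.  OIDs are made up.
old = tablesync_slot_name(subid=16394, relid=16385, system_id=7290830676193699086)
new = tablesync_slot_name(subid=24576, relid=16385, system_id=7290830676193699086)
assert old != new
```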

TBH, I am really wondering if there is any need to go down to being
able to handle anything else than READY for the relation states in
pg_subscription_rel. One reason is that it makes it much easier to
think about how to handle these in parallel of a node with
publications that also need to go through an upgrade, because as READY
relations they don't require any tracking. IMO, this makes it simpler
to think about cases where a node holds both subscriptions and
publications.

FWIW, my take is that it feels natural to do the upgrades of
subscriptions first, creating a similarity with the case of minor
updates with physical replication setups.

Shouldn't we at least ensure that replication origins do exist in the
old cluster corresponding to each of the subscriptions? Otherwise,
later the query to get remote_lsn for origin in getSubscriptions()
would fail.

You mean in the shape of a pre-upgrade check making sure that
pg_replication_origin_status has entries for all the subscriptions we
expect to see during the upgrade? Makes sense to me.
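
One way to picture such a pre-check: the replication origin created for a subscription is named 'pg_' || its OID (the same convention the dump query joins on), so any subscription whose origin name has no matching external_id in pg_replication_origin_status would later yield a NULL remote_lsn. A small sketch of that matching logic (Python, with made-up catalog rows):

```python
# Sketch of the suggested pre-upgrade check: find subscriptions whose
# replication origin ("pg_<subscription oid>") has no entry in
# pg_replication_origin_status.  The OIDs/names below are made up.
subscriptions = {16394: 'sub1', 16400: 'sub2'}   # oid -> subname
origin_status = {'pg_16394'}                     # external_id values present

missing = [name for oid, name in subscriptions.items()
           if f"pg_{oid}" not in origin_status]
assert missing == ['sub2']                       # sub2 would fail the check
```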
--
Michael

#98Michael Paquier
michael@paquier.xyz
In reply to: vignesh C (#96)
Re: pg_upgrade and logical replication

On Mon, Oct 30, 2023 at 03:05:09PM +0530, vignesh C wrote:

The patch was not applying because of recent commits. Here is a
rebased version of the patches.

+     * We don't want the launcher to run while upgrading because it may start
+     * apply workers which could start receiving changes from the publisher
+     * before the physical files are put in place, which would corrupt the
+     * new cluster being upgraded to, so set max_logical_replication_workers=0
+     * to disable the launcher.
      */
     if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
-        appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+        appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1 -c max_logical_replication_workers=0");

At least that's consistent with the other side of the coin with
publications. So 0001 looks basically OK seen from here.

The indentation of 0002 seems off in a few places.

+    <para>
+     Verify that all the subscription tables in the old subscriber are in
+     <literal>r</literal> (ready) state. Set up the
+     <link linkend="logical-replication-config-subscriber"> subscriber
+     configurations</link> in the new subscriber.
[...]
+    <para>
+     There is a prerequisite that all the subscription tables should be in
+     <literal>r</literal> (ready) state for
+     <application>pg_upgrade</application> to be able to upgrade the
+     subscriber. If this is not met an error will be reported.
+    </para>

This part is repeated. Globally, this documentation addition does not
seem really helpful for the end-user as it describes the checks that
are done during the upgrade. Shouldn't this part of the docs,
similarly to the publication part, focus on providing a check list of
actions to take to achieve a clean upgrade, with a list of commands
and configurations required? The good part is that information about
what's copied is provided (pg_subscription_rel and the origin status),
still this could be improved.

+    <para>
+     Enable the subscriptions by executing
+     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+    </para>

This is something users can act on, but how does this operation help
with the upgrade? Should this happen for all the subscriptions? Or do
you mean that this is something that needs to be
run after the upgrade?

+    <para>
+     Create all the new tables that were created in the publication and
+     refresh the publication by executing
+     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+    </para>

What does "new tables" refer to in this case? Are you referring to
the case where new relations have been added on a publication node
after an upgrade and need to be copied? Does one need to DISABLE the
subscriptions on the subscriber node before running the upgrade, or is
a REFRESH enough? The test only uses a REFRESH, so the docs and the
code don't entirely agree with each other.

+  <para>
+   For upgradation of the subscriptions, all the subscription tables should be
+   in <literal>r</literal> (ready) state, or else the
+   <application>pg_upgrade</application> run will error.
+  </para>

"Upgradation"?

+# Set tables to 'i' state
+$old_sub->safe_psql(
+	'postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'i' WHERE srsubstate = 'r'");

I am not sure that doing catalog manipulation in the TAP test itself
is a good idea, because this can finish by being unpredictible in the
long-term for the test maintenance. I think that this portion of the
test should just be removed. poll_query_until() or wait queries
making sure that all the relations are in the state we want them to be
before the beginning of the upgrade is enough in terms of test
coverage, IMO.

+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");

This assumes one row, but perhaps this had better do a match based on
external_id and/or local_id?
--
Michael

#99vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#95)
2 attachment(s)
Re: pg_upgrade and logical replication

On Fri, 27 Oct 2023 at 17:05, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 27, 2023 at 12:09 PM vignesh C <vignesh21@gmail.com> wrote:

Apart from this I'm still checking that the old cluster's subscription
relations states are READY state still, but there is a possibility
that SYNCDONE or FINISHEDCOPY could work, this needs more thought
before concluding which is the correct state to check. Let' handle
this in the upcoming version.

I was analyzing this part and it seems it could be tricky to upgrade
in FINISHEDCOPY state. Because the system would expect that subscriber
would know the old slotname from oldcluster which it can drop at
SYNCDONE state. Now, as sync_slot_name is generated based on subid,
relid which could be different in the new cluster, the generated
slotname would be different after the upgrade. OTOH, if the relstate
is INIT, then I think the sync could be performed even after the
upgrade.

I analyzed all the subscription relation states further; here is
my analysis:
The following states are ok, as either the replication slot is not yet
created or it is already dropped, and the required WAL files will be
present in the publisher:
a) SUBREL_STATE_SYNCDONE b) SUBREL_STATE_READY c) SUBREL_STATE_INIT

The following states are not ok, as the worker has a dependency on the
replication slot/origin in these cases:
a) SUBREL_STATE_DATASYNC: the table sync worker will try to drop the
replication slot, but as the replication slots were created with the old
subscription id in the publisher, the upgraded subscriber will not be
able to clean them up.
b) SUBREL_STATE_FINISHEDCOPY: the tablesync worker will expect the
origin to already exist; as the origin was created with the old
subscription id, the tablesync worker will not be able to find it.
c) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
SUBREL_STATE_UNKNOWN: these states are not stored in the catalog, so we
need not allow them.
I modified it to support the relation states accordingly.
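
Condensing the analysis above into a tiny sketch (Python; state letters per pg_subscription_rel — i=init, d=datasync, f=finishedcopy, s=syncdone, r=ready — and a simplification of the actual pg_upgrade check):

```python
# Upgradability rule for pg_subscription_rel.srsubstate, per the analysis:
# INIT/SYNCDONE/READY carry no dependency on an old-OID slot or origin,
# while DATASYNC/FINISHEDCOPY still do.
UPGRADABLE_STATES = {'i', 's', 'r'}   # no slot/origin dependency remains
BLOCKED_STATES = {'d', 'f'}           # worker still needs old-OID slot/origin

def can_upgrade(srsubstates):
    """True if every relation of every subscription is in a safe state."""
    return all(state in UPGRADABLE_STATES for state in srsubstates)

assert can_upgrade(['r', 'r', 'i', 's'])
assert not can_upgrade(['r', 'd'])    # DATASYNC: slot named after old subid
assert not can_upgrade(['f'])         # FINISHEDCOPY: origin named after old subid
```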

Shouldn't we at least ensure that replication origins do exist in the
old cluster corresponding to each of the subscriptions? Otherwise,
later the query to get remote_lsn for origin in getSubscriptions()
would fail.

Added a check for the same.

The attached v10 version patch has the changes for the same.

Regards,
Vignesh

Attachments:

v10-0001-Prevent-startup-of-logical-replication-launcher-.patchtext/x-patch; charset=US-ASCII; name=v10-0001-Prevent-startup-of-logical-replication-launcher-.patchDownload
From 779fc68636da8c901670574c7284726e0a0c58ac Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Fri, 27 Oct 2023 11:18:28 +0530
Subject: [PATCH v10 1/2] Prevent startup of logical replication launcher in
 binary upgrade mode

The logical replication launcher may start apply workers during an
upgrade, which could be the cause of corruptions on a new cluster if
these are able to apply changes before the physical files are copied
over.

The chance of being able to do so should be small as pg_upgrade uses its
own port and unix domain directory (customizable as well with
--socketdir), but preventing the launcher from starting is safer, because
we are then sure that no changes would ever be applied.

Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm2g9ZKf=y8X6z6MsLCuh8WwU-=Q6pLj35NFi2M5BZNS_A@mail.gmail.com
---
 src/bin/pg_upgrade/server.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index d7f6c268ef..9dedf63a87 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -248,9 +248,14 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
 	 * invalidation of slots during the upgrade. We set this option when
 	 * cluster is PG17 or later because logical replication slots can only be
 	 * migrated since then. Besides, max_slot_wal_keep_size is added in PG13.
+	 * We don't want the launcher to run while upgrading because it may start
+	 * apply workers which could start receiving changes from the publisher
+	 * before the physical files are put in place, which would corrupt the
+	 * new cluster being upgraded to, so set max_logical_replication_workers=0
+	 * to disable the launcher.
 	 */
 	if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
-		appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+		appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1 -c max_logical_replication_workers=0");
 
 	/* Use -b to disable autovacuum. */
 	snprintf(cmd, sizeof(cmd),
-- 
2.34.1

v10-0002-Preserve-the-full-subscription-s-state-during-pg.patchtext/x-patch; charset=US-ASCII; name=v10-0002-Preserve-the-full-subscription-s-state-during-pg.patchDownload
From 7e9440e47d1d0844ffd6647d4e341a07f033de86 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v10 2/2] Preserve the full subscription's state during
 pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling
it, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_create_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_create_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_create_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, the state is the state of the relation, sublsn is subscription lsn
which is optional, and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values (subname, relid, state and sublsn) from the
old cluster.

The subscription's replication origin is needed to ensure
that we don't replicate anything twice.

To address this, the patch also teaches pg_dump to update the replication
origin when creating the subscription, by using the
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is the subscription
LSN. pg_dump will retrieve these values (subname and sublsn) from the old
cluster.

pg_upgrade will check that all the subscription relations are in 'i' (init),
's' (data sync) or 'r' (ready) state, and will error out if that's not the
case, logging the reason for the failure.
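
Before running pg_upgrade, the relation states can be inspected on the old
subscriber with a query along these lines (an illustrative sketch, not part
of the patch):

SELECT s.subname, r.srrelid::regclass, r.srsubstate
FROM pg_catalog.pg_subscription_rel r
JOIN pg_catalog.pg_subscription s ON s.oid = r.srsubid
WHERE r.srsubstate NOT IN ('i', 's', 'r');

Any row returned identifies a relation that would make the pg_upgrade check
fail.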

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  45 +++++
 src/backend/catalog/pg_subscription.c      |   2 +
 src/backend/utils/adt/pg_upgrade_support.c | 126 +++++++++++++
 src/bin/pg_dump/common.c                   |  22 +++
 src/bin/pg_dump/pg_dump.c                  | 197 +++++++++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  16 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 105 +++++++++++
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 200 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 ++
 src/tools/pgindent/typedefs.list           |   1 +
 12 files changed, 732 insertions(+), 4 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 46e8a0b746..280621389d 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,45 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Verify that all the subscription tables in the old subscriber are in
+     <literal>r</literal> (ready) state. Set up the
+     <link linkend="logical-replication-config-subscriber">subscriber
+     configuration</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies, which include the subscription's table information present
+     in the <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and the subscription's replication origin, so that
+     logical replication can continue from where the old subscriber was
+     replicating. This avoids having to set up the subscription objects
+     manually, which would require truncating all the subscription tables
+     and resetting the logical replication slots. Migration of subscriber
+     dependencies is only supported when the old cluster is version 17.0 or
+     later; subscriber dependencies on clusters before version 17.0 are
+     silently ignored.
+    </para>
+
+    <para>
+     It is a prerequisite that all the subscription tables be in
+     <literal>r</literal> (ready) state for
+     <application>pg_upgrade</application> to be able to upgrade the
+     subscriber; otherwise an error will be reported.
+    </para>
+
+    <para>
+     Enable the subscriptions by executing
+     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+    </para>
+    <para>
+     Create any tables that were created in the publication in the meantime,
+     and refresh the subscription by executing
+     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+    </para>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
@@ -928,6 +967,12 @@ psql --username=postgres --file=script.sql postgres
    (<type>regclass</type>, <type>regrole</type>, and <type>regtype</type> can be upgraded.)
   </para>
 
+  <para>
+   To upgrade the subscriptions, all the subscription tables should be
+   in <literal>r</literal> (ready) state, or else the
+   <application>pg_upgrade</application> run will error.
+  </para>
+
   <para>
    If you want to use link mode and you do not want your old cluster
    to be modified when the new cluster is started, consider using the clone mode.
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d6a978f136..492b34ff12 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -25,6 +25,8 @@
 #include "catalog/pg_type.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..e8b12adb3c 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,21 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +311,123 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_create_sub_rel_state
+ *
+ * Add the relation with the specified relation state to the
+ * pg_subscription_rel catalog.
+ */
+Datum
+binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_create_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+
+	if (PG_ARGISNULL(3))
+		sublsn = InvalidXLogRecPtr;
+	else
+		sublsn = PG_GETARG_LSN(3);
+
+	if (!OidIsValid(relid))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("invalid relation identifier used: %u", relid));
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	sublsn;
+	char		originname[NAMEDATALEN];
+	RepOriginId originid;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+	if (PG_ARGISNULL(1))
+		sublsn = InvalidXLogRecPtr;
+	else
+		sublsn = PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+	originid = replorigin_by_name(originname, false);
+	replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e863913849..a81d1384a4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4581,6 +4582,99 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  get information about subscription membership for dumpable tables; this
+ *	  is used only in binary-upgrade mode.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i;
+	int			cur_rel = 0;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get subscription relation fields */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[cur_rel].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[cur_rel].dobj.catId.tableoid = relid;
+		subrinfo[cur_rel].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[cur_rel].dobj);
+		subrinfo[cur_rel].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[cur_rel].tblinfo = tblinfo;
+		subrinfo[cur_rel].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[cur_rel].srsublsn = NULL;
+		else
+			subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[cur_rel].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[cur_rel].dobj), fout);
+
+		cur_rel++;
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4607,6 +4701,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4662,16 +4757,19 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	appendPQExpBufferStr(query, "o.remote_lsn\n");
 	appendPQExpBufferStr(query,
 						 "FROM pg_subscription s\n"
+						 "LEFT JOIN pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4698,6 +4796,7 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "remote_lsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4735,6 +4834,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4744,6 +4848,80 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  dump the definition of the given subscription table mapping; this is
+ *	  used only in binary-upgrade mode.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_create_sub_rel_state will add the subscription
+		 * relation to the pg_subscription_rel catalog; this is supported
+		 * only during binary upgrade.
+		 */
+		if (fout->remoteVersion >= 170000)
+		{
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_create_sub_rel_state(");
+			appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+			appendPQExpBuffer(query,
+							  ", %u, '%c'",
+							  subrinfo->tblinfo->dobj.catId.oid,
+							  subrinfo->srsubstate);
+
+			if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", '%s'",
+								  subrinfo->srsublsn);
+			else
+				appendPQExpBuffer(query, ", NULL");
+
+			appendPQExpBufferStr(query, ");\n");
+		}
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4824,6 +5002,17 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000 &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10442,6 +10631,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18508,6 +18700,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..3012da5b49 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -671,8 +672,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char       *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char            srsubstate;
+	char       *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +711,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +771,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..4a4b91224d 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d)",
+					 obj->dumpId);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..9dd34f742b 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -20,6 +20,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_state(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -112,6 +113,8 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
+	check_for_subscription_state(&old_cluster);
+
 	/*
 	 * Logical replication slots can be migrated since PG17. See comments atop
 	 * get_old_cluster_logical_slot_infos().
@@ -812,6 +815,108 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_state()
+ *
+ * Verify that each subscription has all its corresponding tables in
+ * 'i' (init), 's' (data sync) or 'r' (ready) state.
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	/* Subscription relations state can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subscription_state.txt");
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "replication origin is missing for database:%s subscription:%s\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, c.relname, n.nspname "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE srsubstate NOT IN ('i', 's', 'r') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+							output_path, strerror(errno));
+
+			fprintf(script, "database:%s subscription:%s schema:%s relation:%s in non-ready state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication\n"
+				 "origin and/or subscription relations not in the expected state.\n"
+				 "A list of the problematic subscriptions and relations is in the\n"
+				 "file: %s",
+				 output_path);
+	}
+	else
+		check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..60ac1924bd
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,200 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use Cwd qw(abs_path);
+use File::Basename qw(dirname);
+use File::Compare;
+use File::Find qw(find);
+use File::Path qw(rmtree);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::AdjustUpgrade;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $bindir = $new_sub->config_data('--bindir');
+
+sub insert_line
+{
+	my $payload = shift;
+
+	foreach ("t1", "t2")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("t1", "t2")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line('before initial sync');
+
+# Setup logical replication, replicating only 1 table
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR TABLE t1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub CONNECTION '$connstr' PUBLICATION pub");
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'sub');
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready state.
+# ------------------------------------------------------
+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r');";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with tables in ready state');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Check the number of rows for each table on each server
+my $result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on publisher");
+$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(1), "check initial t2 table data on publisher");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on the old subscriber");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0), "check initial t2 table data on the old subscriber");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if there's a subscription with tables in
+# a state other than 'r' (ready), 'i' (init) or 's' (data sync).
+# ------------------------------------------------------
+
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub DISABLE");
+
+# Set tables to 'd' state
+$old_sub->safe_psql(
+	'postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'd' WHERE srsubstate = 'r'");
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with incorrect sub rel');
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# ------------------------------------------------------
+# Check that pg_upgrade doesn't detect any problem once all the subscription's
+# relations are in 'r' (ready) state.
+# ------------------------------------------------------
+
+$old_sub->safe_psql(
+	'postgres',
+	"UPDATE pg_subscription_rel
+		SET srsubstate = 'r' WHERE srsubstate = 'd'");
+
+# ------------------------------------------------------
+# Incremental changes made on the publisher are replicated after the upgrade.
+# ------------------------------------------------------
+
+# Stop the old subscriber, insert a row in each table while it's down and add
+# t2 to the publication
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+insert_line('while old_sub is down');
+
+# Run pg_upgrade
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+	],
+	'run of pg_upgrade for new sub');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after pg_upgrade success");
+$publisher->safe_psql('postgres', "ALTER PUBLICATION pub ADD TABLE t2");
+
+$new_sub->start;
+
+# Subscription relations and replication origin remote_lsn should be preserved
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(1), "There should be 1 row in pg_subscription_rel");
+
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# There should be no new replicated rows before enabling the subscription
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1),
+	"t1 table has no new replicated rows before enabling the subscription");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in t2, which is not part of the publication");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
+
+$publisher->wait_for_catchup('sub');
+
+# Rows on t1 should have been replicated, while nothing should happen for t2
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2), "check replicated inserts on new subscriber");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in t2, which is not part of the publication, after enabling the subscription"
+);
+
+# Refresh the subscription, only the missing row on t2 should be replicated
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'sub');
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2),
+	"check there is no change when no changes were replicated");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 568aa80d92..40ad5f2fb9 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11379,6 +11379,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_create_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_create_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 87c1aee379..90b321945c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2656,6 +2656,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#100Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#99)
Re: pg_upgrade and logical replication

Here are some review comments for patch v10-0001

======
Commit message

1.
The chance of being able to do so should be small as pg_upgrade uses its
own port and unix domain directory (customizable as well with
--socketdir), but just preventing the launcher to start is safer at the
end, because we are then sure that no changes would ever be applied.

~

"safer at the end" (??)

======
src/bin/pg_upgrade/server.c

2.
+ * We don't want the launcher to run while upgrading because it may start
+ * apply workers which could start receiving changes from the publisher
+ * before the physical files are put in place, causing corruption on the
+ * new cluster upgrading to, so setting max_logical_replication_workers=0
+ * to disable launcher.
  */
  if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
- appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1 -c
max_logical_replication_workers=0");

2a.
The comment is one big long sentence. IMO it will be better to break it up.

~

2b.
Add a blank line between this comment note and the previous one.

~~~

2c.
In a recent similar thread [1], they chose to implement a guc_hook to
prevent a user from overriding this via the command line option during
the upgrade. Shouldn't this patch do the same thing, for consistency?

~~~

2d.
If you do implement such a guc_hook (per #2c above), then should the
patch also include a test case for getting an ERROR if the user tries
to override that GUC?

======
[1]: /messages/by-id/20231027.115759.2206827438943188717.horikyota.ntt@gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

#101Michael Paquier
michael@paquier.xyz
In reply to: Peter Smith (#100)
Re: pg_upgrade and logical replication

On Thu, Nov 02, 2023 at 04:35:26PM +1100, Peter Smith wrote:

The chance of being able to do so should be small as pg_upgrade uses its
own port and unix domain directory (customizable as well with
--socketdir), but just preventing the launcher to start is safer at the
end, because we are then sure that no changes would ever be applied.
~
"safer at the end" (??)

Well, just safer.

2a.
The comment is one big long sentence. IMO it will be better to break it up.
2b.
Add a blank line between this comment note and the previous one.

Yes, I found that equally confusing when looking at this patch, so
I've edited the patch this way when I was looking at it today. This
is enough to do the job, so I have applied it for now, before moving
on with the second one of this thread.

2c.
In a recent similar thread [1], they chose to implement a guc_hook to
prevent a user from overriding this via the command line option during
the upgrade. Shouldn't this patch do the same thing, for consistency?
2d.
If you do implement such a guc_hook (per #2c above), then should the
patch also include a test case for getting an ERROR if the user tries
to override that GUC?

Yeah, that may be something to do, but I am not sure that it is worth
complicating the backend code for the remote case where one enforces
an option while we are already setting a GUC in the upgrade path:
/messages/by-id/CAA4eK1Lh9J5VLypSQugkdD+H=_5-6p3rOocjo7JbTogcxA2hxg@mail.gmail.com

That feels like a lot of extra facility for cases that should never
happen.
--
Michael

#102Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#97)
Re: pg_upgrade and logical replication

On Wed, Nov 1, 2023 at 8:33 AM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Oct 27, 2023 at 05:05:39PM +0530, Amit Kapila wrote:

I was analyzing this part and it seems it could be tricky to upgrade
in FINISHEDCOPY state. Because the system would expect that subscriber
would know the old slotname from oldcluster which it can drop at
SYNCDONE state. Now, as sync_slot_name is generated based on subid,
relid which could be different in the new cluster, the generated
slotname would be different after the upgrade. OTOH, if the relstate
is INIT, then I think the sync could be performed even after the
upgrade.

TBH, I am really wondering if there is any need to go down to being
able to handle anything else than READY for the relation states in
pg_subscription_rel. One reason is that it makes it much easier to
think about how to handle these in parallel of a node with
publications that also need to go through an upgrade, because as READY
relations they don't require any tracking. IMO, this makes it simpler
to think about cases where a node holds both subscriptions and
publications.

But that poses needless restrictions for the users. For example, there
appears no harm in upgrading even when the relation is in
SUBREL_STATE_INIT state. Users should be able to continue replication
after the upgrade.
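The pre-upgrade check under discussion amounts to something like the following query on the old subscriber (a sketch against the pg_subscription_rel catalog; pg_upgrade would error out if it returned any rows, with the allowed states being 'i' init, 's' data sync and 'r' ready):

```sql
SELECT s.subname, sr.srrelid::regclass AS relation, sr.srsubstate
FROM pg_subscription s
JOIN pg_subscription_rel sr ON sr.srsubid = s.oid
WHERE sr.srsubstate NOT IN ('i', 's', 'r');
```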

FWIW, my take is that it feels natural to do the upgrades of
subscriptions first, creating a similarity with the case of minor
updates with physical replication setups.

Shouldn't we at least ensure that replication origins do exist in the
old cluster corresponding to each of the subscriptions? Otherwise,
later the query to get remote_lsn for origin in getSubscriptions()
would fail.

You mean in the shape of a pre-upgrade check making sure that
pg_replication_origin_status has entries for all the subscriptions we
expect to see during the upgrade?

Yes.

--
With Regards,
Amit Kapila.

#103vignesh C
vignesh21@gmail.com
In reply to: Michael Paquier (#98)
1 attachment(s)
Re: pg_upgrade and logical replication

On Wed, 1 Nov 2023 at 10:13, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Oct 30, 2023 at 03:05:09PM +0530, vignesh C wrote:

The patch was not applying because of recent commits. Here is a
rebased version of the patches.

+     * We don't want the launcher to run while upgrading because it may start
+     * apply workers which could start receiving changes from the publisher
+     * before the physical files are put in place, causing corruption on the
+     * new cluster upgrading to, so setting max_logical_replication_workers=0
+     * to disable launcher.
*/
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
-        appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+        appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1 -c max_logical_replication_workers=0");

At least that's consistent with the other side of the coin with
publications. So 0001 looks basically OK seen from here.

The indentation of 0002 seems off in a few places.

I fixed wherever possible for documentation and also ran pgindent and
pgperltidy.

+    <para>
+     Verify that all the subscription tables in the old subscriber are in
+     <literal>r</literal> (ready) state. Setup the
+     <link linkend="logical-replication-config-subscriber"> subscriber
+     configurations</link> in the new subscriber.
[...]
+    <para>
+     There is a prerequisites that all the subscription tables should be in
+     <literal>r</literal> (ready) state for
+     <application>pg_upgrade</application> to be able to upgrade the
+     subscriber. If this is not met an error will be reported.
+    </para>

This part is repeated.

Removed the duplicate contents.

Globally, this documentation addition does not
seem really helpful for the end-user as it describes the checks that
are done during the upgrade. Shouldn't this part of the docs,
similarly to the publication part, focus on providing a check list of
actions to take to achieve a clean upgrade, with a list of commands
and configurations required? The good part is that information about
what's copied is provided (pg_subscription_rel and the origin status),
still this could be improved.

I have slightly modified it now and also made it consistent with the
replication slot upgrade, but I was not sure if we need to add
anything more. Let me know if anything else needs to be added. I will
add it.

+    <para>
+     Enable the subscriptions by executing
+     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+    </para>

This is something users can act on, but how does this operation help
with the upgrade? Should this happen for all the descriptions
subscriptions? Or you mean that this is something that needs to be
run after the upgrade?

The subscriptions will be upgraded in disabled mode. Users must enable
the subscriptions after the upgrade is completed. I have mentioned the
same to avoid confusion.

+    <para>
+     Create all the new tables that were created in the publication and
+     refresh the publication by executing
+     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+    </para>

What does "new tables" refer to in this case? Are you referring to
the case where new relations have been added on a publication node
after an upgrade and need to be copied? Does one need to DISABLE the
subscriptions on the subscriber node before running the upgrade, or is
a REFRESH enough? The test only uses a REFRESH, so the docs and the
code don't entirely agree with each other.

Yes, "new tables" refers to the tables created on the publisher while the
upgrade is in progress. There is no need to disable the subscription before
the upgrade: during the upgrade the subscriptions are copied in disabled
mode, and they should be enabled afterwards. Mentioned all of this
accordingly.
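Concretely, the post-upgrade sequence described here boils down to the following (using a hypothetical subscription named sub):

```sql
-- After the new cluster is started, re-enable each migrated subscription.
ALTER SUBSCRIPTION sub ENABLE;

-- Then pick up any tables added to the publication during the upgrade;
-- copy_data defaults to true, so only the newly added tables are synced
-- from scratch, while already-known relations keep their preserved state.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
```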

+  <para>
+   For upgradation of the subscriptions, all the subscription tables should be
+   in <literal>r</literal> (ready) state, or else the
+   <application>pg_upgrade</application> run will error.
+  </para>

"Upgradation"?

I have removed this content since we have added this in the
prerequisite section now.

+# Set tables to 'i' state
+$old_sub->safe_psql(
+       'postgres',
+       "UPDATE pg_subscription_rel
+               SET srsubstate = 'i' WHERE srsubstate = 'r'");

I am not sure that doing catalog manipulation in the TAP test itself
is a good idea, because this can finish by being unpredictable in the
long-term for the test maintenance. I think that this portion of the
test should just be removed. poll_query_until() or wait queries
making sure that all the relations are in the state we want them to be
before the beginning of the upgrade is enough in terms of test
coverage, IMO.

Changed the scenario by using primary key failure.

+$result = $new_sub->safe_psql('postgres',
+       "SELECT remote_lsn FROM pg_replication_origin_status");

This assumes one row, but perhaps this had better do a match based on
external_id and/or local_id?

Modified
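The adjusted check can be written along these lines (a sketch; 'pg_' || oid is the origin name that ReplicationOriginNameForLogicalRep() generates for a subscription, matching the join used in the pg_dump query):

```sql
SELECT o.remote_lsn
FROM pg_replication_origin_status o
JOIN pg_subscription s ON o.external_id = 'pg_' || s.oid
WHERE s.subname = 'sub';  -- 'sub' is the test's subscription name
```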

The attached v11 version patch has the changes for the same.

Regards,
Vignesh

Attachments:

v11-0001-Preserve-the-full-subscription-s-state-during-pg.patchtext/x-patch; charset=US-ASCII; name=v11-0001-Preserve-the-full-subscription-s-state-during-pg.patchDownload
From 768cba6d701913660b7e930d68e8b5497aae499f Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v11] Preserve the full subscription's state during pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling
it, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_create_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_create_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_create_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, the state is the state of the relation, sublsn is subscription lsn
which is optional, and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values (subname, relid, state and sublsn) from the
old cluster.

The subscription's replication origin is needed to ensure
that we don't replicate anything twice.

To achieve this, the patch also teaches pg_dump to update the replication
origin along with create subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is subscription lsn.
pg_dump will retrieve these values (subname and sublsn) from the old cluster.

pg_upgrade will check that all the subscription relations are in 'i' (init),
's' (data sync) or 'r' (ready) state, and will error out if that's not the
case, logging the reason for the failure.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  67 +++++++
 src/backend/catalog/pg_subscription.c      |   2 +
 src/backend/utils/adt/pg_upgrade_support.c | 126 ++++++++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 197 +++++++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  16 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 105 ++++++++++
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 222 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/tools/pgindent/typedefs.list           |   1 +
 12 files changed, 776 insertions(+), 4 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 46e8a0b746..eba88e18aa 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,73 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Setup the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies, which include the subscription table information present in
+     the <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and the subscription replication origin. This allows
+     logical replication on the new subscriber to continue from where the old
+     subscriber left off, and avoids having to set up the subscription objects
+     manually, which would require truncating all the subscription tables and
+     resetting the logical replication slots. Migration of subscriber
+     dependencies is only supported when the old cluster is version 17.0 or
+     later; subscriber dependencies on clusters before version 17.0 will
+     silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in the
+       <literal>i</literal> (initialize), <literal>r</literal> (ready) or
+       <literal>s</literal> (synchronized) state. This can be verified by
+       checking
+       <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be verified by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+    </itemizedlist>
+
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled
+     state. They can be enabled after the upgrade by following these steps:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Create all the new tables that were created in the publication while
+       the upgrade was in progress and refresh the publication by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d6a978f136..492b34ff12 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -25,6 +25,8 @@
 #include "catalog/pg_type.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..e8b12adb3c 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,21 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +311,123 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_create_sub_rel_state
+ *
+ * Add the relation with the specified relation state to the
+ * pg_subscription_rel catalog.
+ */
+Datum
+binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_create_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+
+	if (PG_ARGISNULL(3))
+		sublsn = InvalidXLogRecPtr;
+	else
+		sublsn = PG_GETARG_LSN(3);
+
+	if (!OidIsValid(relid))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("invalid relation identifier used: %u", relid));
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	sublsn;
+	char		originname[NAMEDATALEN];
+	RepOriginId originid;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+	if (PG_ARGISNULL(1))
+		sublsn = InvalidXLogRecPtr;
+	else
+		sublsn = PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+	originid = replorigin_by_name(originname, false);
+	replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e863913849..a81d1384a4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4581,6 +4582,99 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  get information about subscription membership for dumpable tables; this
+ *	  is used only in binary-upgrade mode.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i;
+	int			cur_rel = 0;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get subscription relation fields */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[cur_rel].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[cur_rel].dobj.catId.tableoid = relid;
+		subrinfo[cur_rel].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[cur_rel].dobj);
+		subrinfo[cur_rel].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[cur_rel].tblinfo = tblinfo;
+		subrinfo[cur_rel].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[cur_rel].srsublsn = NULL;
+		else
+			subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[cur_rel].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[cur_rel].dobj), fout);
+
+		cur_rel++;
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4607,6 +4701,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4662,16 +4757,19 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	appendPQExpBufferStr(query, "o.remote_lsn\n");
 	appendPQExpBufferStr(query,
 						 "FROM pg_subscription s\n"
+						 "LEFT JOIN pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4698,6 +4796,7 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "remote_lsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4735,6 +4834,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4744,6 +4848,80 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  dump the definition of the given subscription table mapping; this is
+ *    used only during binary upgrade.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_create_sub_rel_state will add the subscription
+		 * relation to the pg_subscription_rel table; this is supported only
+		 * during binary upgrade.
+		 */
+		if (fout->remoteVersion >= 170000)
+		{
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_create_sub_rel_state(");
+			appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+			appendPQExpBuffer(query,
+							  ", %u, '%c'",
+							  subrinfo->tblinfo->dobj.catId.oid,
+							  subrinfo->srsubstate);
+
+			if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", '%s'",
+								  subrinfo->srsublsn);
+			else
+				appendPQExpBuffer(query, ", NULL");
+
+			appendPQExpBufferStr(query, ");\n");
+		}
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4824,6 +5002,17 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000 &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10442,6 +10631,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18508,6 +18700,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..62b3d9249b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -671,8 +672,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +711,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +771,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..4a4b91224d 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d)",
+					 obj->dumpId);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..fda4aeed24 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -20,6 +20,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_state(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -112,6 +113,8 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
+	check_for_subscription_state(&old_cluster);
+
 	/*
 	 * Logical replication slots can be migrated since PG17. See comments atop
 	 * get_old_cluster_logical_slot_infos().
@@ -812,6 +815,108 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_state()
+ *
+ * Verify that each subscription has its replication origin and that all of
+ * its relations are in 'i' (init), 's' (synchronized) or 'r' (ready) state.
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	/* Subscription relations state can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subscription_state.txt");
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "replication origin is missing for database:%s subscription:%s\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, c.relname, n.nspname "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE srsubstate NOT IN ('i', 's', 'r') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "database:%s subscription:%s schema:%s relation:%s in non-ready state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin\n"
+				 "and/or subscription relations that are not in a ready state.\n"
+				 "A list of the problematic subscriptions is in the file: %s",
+				 output_path);
+	}
+	else
+		check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..b56d2ec574
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,222 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use Cwd qw(abs_path);
+use File::Basename qw(dirname);
+use File::Compare;
+use File::Find qw(find);
+use File::Path qw(rmtree);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::AdjustUpgrade;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $bindir = $new_sub->config_data('--bindir');
+
+sub insert_line
+{
+	my $payload = shift;
+
+	foreach ("t1", "t2")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("t1", "t2")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line('before initial sync');
+
+# Setup logical replication, replicating only 1 table
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub FOR TABLE t1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION regress_pub"
+);
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready state.
+# ------------------------------------------------------
+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r');";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
	'run of pg_upgrade --check for old instance with all tables in ready state');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Check the number of rows for each table on each server
+my $result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on publisher");
+$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t2");
is($result, qq(1), "check initial t2 table data on publisher");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on the old subscriber");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0), "check initial t2 table data on the old subscriber");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if there's a subscription with tables in
+# a state different than 'r' (ready), 'i' (init) and 's' (synchronized).
+# ------------------------------------------------------
+
+$publisher->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial, val text);");
+$old_sub->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+# Add a row in subscriber so that the table sync will fail.
+$old_sub->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_primary_key");
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with incorrect sub rel');
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# ------------------------------------------------------
+# Check that pg_upgrade doesn't detect any problem once all the subscription's
+# relations are in 'r' (ready) state.
+# ------------------------------------------------------
+
+# Delete the table data so that the primary key violation error will not happen
+# and tab_primary_key reaches ready state.
+$old_sub->safe_psql('postgres', "DELETE FROM tab_primary_key");
+
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# ------------------------------------------------------
+# The incremental changes added to the publisher are replicated after upgrade.
+# ------------------------------------------------------
+
+# Stop the old subscriber, insert a row in each table while it's down and add
+# t2 to the publication
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+insert_line('while old_sub is down');
+
+# Run pg_upgrade
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+	],
+	'run of pg_upgrade for new sub');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after pg_upgrade success");
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE t2");
+
+$new_sub->start;
+
+# Subscription relations and replication origin remote_lsn should be preserved
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2), "There should be 2 rows in pg_subscription_rel");
+
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s where os.external_id = 'pg_' || s.oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# There should be no new replicated rows before enabling the subscription
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1),
+	"t1 table has no new replicated rows before enabling the subscription");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in t2 table which is not part of the publication");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub ENABLE");
+
+$publisher->wait_for_catchup('regress_sub');
+
+# Rows on t1 should have been replicated, while nothing should happen for t2
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2), "check replicated inserts on new subscriber");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in t2, which is not part of the publication, after enabling the subscription"
+);
+
+# Refresh the subscription, only the missing row on t2 should be replicated
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2),
+	"check there is no change on t1 when no new changes were replicated");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 568aa80d92..40ad5f2fb9 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11379,6 +11379,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_create_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_create_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 87c1aee379..90b321945c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2656,6 +2656,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#104Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#103)
Re: pg_upgrade and logical replication

On Thu, Nov 2, 2023 at 3:41 PM vignesh C <vignesh21@gmail.com> wrote:

I have slightly modified it now and also made it consistent with the
replication slot upgrade, but I was not sure if we need to add
anything more. Let me know if anything else needs to be added. I will
add it.

I think it is important for users to know how they upgrade their
multi-node setup. Say a two-node setup where replication is working
both ways (aka each node has both publications and subscriptions),
similarly, how to upgrade, if there are multiple nodes involved?

One more thing I was thinking about this patch was that here unlike
the publication's slot information, we can't ensure with origin's
remote_lsn that all the WAL is received and applied before allowing
the upgrade. I can't think of any problem at the moment due to this
but still a point worth giving a thought.
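
To make this concrete, here is a rough manual check (just an illustration, not something the patch enforces) that an operator could run before shutting down the old subscriber, comparing the publisher's WAL position with the origin's applied position:

```sql
-- On the publisher, after writes have stopped: note the current WAL position.
SELECT pg_current_wal_lsn();

-- On the old subscriber: the last remote position applied for each origin.
SELECT external_id, remote_lsn
FROM pg_replication_origin_status;
```

If remote_lsn has reached the publisher position captured above, all changes have been received and applied; otherwise some WAL is still in flight and would only be consumed after the upgrade.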

--
With Regards,
Amit Kapila.

#105vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#100)
1 attachment(s)
Re: pg_upgrade and logical replication

On Thu, 2 Nov 2023 at 11:05, Peter Smith <smithpb2250@gmail.com> wrote:

~~~

2c.
In a recent similar thread [1], they chose to implement a guc_hook to
prevent a user from overriding this via the command line option during
the upgrade. Shouldn't this patch do the same thing, for consistency?

Added GUC hook for consistency.

~~~

2d.
If you do implement such a guc_hook (per #2c above), then should the
patch also include a test case for getting an ERROR if the user tries
to override that GUC?

Added a test for the same.

We can use this patch if we are planning to go ahead with guc_hooks
for max_slot_wal_keep_size as discussed at [1].
The attached patch has the changes for the same.

[1]: /messages/by-id/CAHut+PsTrB=mjBA-Y-+W4kK63tao9=XBsMXG9rkw4g_m9WatwA@mail.gmail.com

Regards,
Vignesh

Attachments:

0001-Added-GUC-hook-for-max_logical_replication_workers.patchtext/x-patch; charset=US-ASCII; name=0001-Added-GUC-hook-for-max_logical_replication_workers.patchDownload
From 00050247bad78b331dc1f841296dd40b3f37ecaf Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Fri, 3 Nov 2023 14:57:48 +0530
Subject: [PATCH] Added GUC hook for max_logical_replication_workers.

During a binary upgrade, pg_upgrade sets this variable to 0 via the command
line to prevent startup of the logical replication launcher, which may start
apply workers that could begin receiving changes from the publisher before the
physical files are put in place, causing corruption on the new cluster.
However, users have ways to override it. Add a GUC check hook to prevent
max_logical_replication_workers from being overridden during binary upgrade.
---
 src/backend/utils/init/postinit.c      | 24 +++++++++++++++++
 src/backend/utils/misc/guc_tables.c    |  2 +-
 src/bin/pg_upgrade/t/002_pg_upgrade.pl | 36 ++++++++++++++++++++++++++
 src/include/utils/guc_hooks.h          |  2 ++
 4 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 552cf9d950..22e37d34e7 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -617,6 +617,30 @@ check_max_wal_senders(int *newval, void **extra, GucSource source)
 	return true;
 }
 
+/*
+ * GUC check_hook for max_logical_replication_workers
+ *
+ * During a binary upgrade, pg_upgrade sets this variable to 0 via the command
+ * line in an attempt to prevent startup of the logical replication launcher
+ * which may start apply workers that could start receiving changes from the
+ * publisher before the physical files are put in place, causing corruption on
+ * the new cluster upgrading to, but users have ways to override it. To ensure
+ * the successful completion of the upgrade, it's essential to keep this
+ * variable unaltered.  See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_max_logical_replication_workers(int *newval, void **extra,
+									  GucSource source)
+{
+	if (IsBinaryUpgrade && *newval)
+	{
+		GUC_check_errdetail("\"%s\" must be set to 0 during binary upgrade mode.",
+							"max_logical_replication_workers");
+		return false;
+	}
+	return true;
+}
+
 /*
  * Early initialization of a backend (either standalone or under postmaster).
  * This happens even before InitPostgres.
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 7605eff9b9..4a5ff3d317 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3049,7 +3049,7 @@ struct config_int ConfigureNamesInt[] =
 		},
 		&max_logical_replication_workers,
 		4, 0, MAX_BACKENDS,
-		NULL, NULL, NULL
+		check_max_logical_replication_workers, NULL, NULL
 	},
 
 	{
diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index c6d83d3c21..a6ca422c58 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -371,6 +371,42 @@ $oldnode->start;
 $oldnode->safe_psql('postgres', 'DROP DATABASE regression_invalid');
 $oldnode->stop;
 
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $oldnode->data_dir,
+		'-D',         $newnode->data_dir, '-b', $oldbindir,
+		'-B',         $newbindir,         '-s', $newnode->host,
+		'-p',         $oldnode->port,     '-P', $newnode->port,
+		'-O -c max_logical_replication_workers=1',
+		$mode,        '--check',
+	],
+	'run of pg_upgrade with invalid max_logical_replication_workers');
+ok(-d $newnode->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ not removed after pg_upgrade failure");
+# Verify the reason why the server failed to start during pg_upgrade.
+my $upgrade_logfile;
+
+# Find pg_upgrade_server text file. We cannot predict the file's path because
+# the output directory contains a milliseconds timestamp.
+# File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/pg_upgrade_server\.log/)
+		{
+			$upgrade_logfile = $File::Find::name;
+		}
+	},
+	$newnode->data_dir . "/pg_upgrade_output.d");
+
+# Check that the server has logged a message saying server start failed with
+# invalid max_logical_replication_workers error.
+like(
+	slurp_file($upgrade_logfile),
+	qr/"max_logical_replication_workers\" must be set to 0 during binary upgrade mode/m,
+	'cannot specify a different value to max_logical_replication_workers');
+
+rmtree($newnode->data_dir . "/pg_upgrade_output.d");
+
 # --check command works here, cleans up pg_upgrade_output.d.
 command_ok(
 	[
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 2a191830a8..dda5651fa8 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -84,6 +84,8 @@ extern bool check_maintenance_io_concurrency(int *newval, void **extra,
 extern void assign_maintenance_io_concurrency(int newval, void *extra);
 extern bool check_max_connections(int *newval, void **extra, GucSource source);
 extern bool check_max_wal_senders(int *newval, void **extra, GucSource source);
+extern bool check_max_logical_replication_workers(int *newval, void **extra,
+												  GucSource source);
 extern void assign_max_wal_size(int newval, void *extra);
 extern bool check_max_worker_processes(int *newval, void **extra,
 									   GucSource source);
-- 
2.34.1

#106Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#104)
Re: pg_upgrade and logical replication

On Thu, Nov 02, 2023 at 05:00:55PM +0530, Amit Kapila wrote:

I think it is important for users to know how they upgrade their
multi-node setup. Say a two-node setup where replication is working
both ways (aka each node has both publications and subscriptions),
similarly, how to upgrade, if there are multiple nodes involved?

+1.  My next remarks also apply to the thread where publishers are
handled in upgrades, but I'd like to think that at the end of the
release cycle it would be nice to have the basic features in, with
also a set of regression tests for logical upgrade scenarios that we'd
expect to work.  Two "basic" ones coming into mind:
- Cascading logical setup, with one node in the middle having both
publisher(s) and subscriber(s).
- Two-way replication, with two nodes.
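
For reference, the two-way scenario could be set up along these lines (a
sketch only; node and object names are made up, and origin = none plus
copy_data = false are used so that changes that were themselves replicated
are not replicated back):

```sql
-- On node A
CREATE PUBLICATION pub_a FOR ALL TABLES;
CREATE SUBSCRIPTION sub_b_to_a
    CONNECTION 'host=node_b dbname=postgres'
    PUBLICATION pub_b WITH (origin = none, copy_data = false);

-- On node B
CREATE PUBLICATION pub_b FOR ALL TABLES;
CREATE SUBSCRIPTION sub_a_to_b
    CONNECTION 'host=node_a dbname=postgres'
    PUBLICATION pub_a WITH (origin = none, copy_data = false);
```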

One more thing I was thinking about this patch was that here unlike
the publication's slot information, we can't ensure with origin's
remote_lsn that all the WAL is received and applied before allowing
the upgrade. I can't think of any problem at the moment due to this
but still a point worth giving a thought.

Yeah, that may be an itchy point, which is also related to my concerns
on trying to allow more syncstates than ready when beginning the
upgrade, which is at least a point we are sure that a relation was up
to date, up to a certain point.
--
Michael

#107Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#103)
Re: pg_upgrade and logical replication

Here are some review comments for patch v11-0001

======
Commit message

1.
The subscription's replication origin are needed to ensure
that we don't replicate anything twice.

~

/are needed/is needed/

~~~

2.
Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: /messages/by-id/20230217075433.u5mjly4d5cr4hcfe@jrouhaud

~

Include Vignesh as another author.

======
doc/src/sgml/ref/pgupgrade.sgml

3.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies which includes the subscription tables information present in
+     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system table and the subscription replication origin which
+     will help in continuing logical replication from where the old subscriber
+     was replicating. This helps in avoiding the need for setting up the

I became a bit lost reading this paragraph due to the multiple 'which'...

SUGGESTION
pg_upgrade attempts to migrate subscription dependencies which
includes the subscription table information present in
pg_subscription_rel system
catalog and also the subscription replication origin. This allows
logical replication on the new subscriber to continue from where the
old subscriber was up to.

~~~

4.
+     was replicating. This helps in avoiding the need for setting up the
+     subscription objects manually which requires truncating all the
+     subscription tables and setting the logical replication slots. Migration

SUGGESTION
Having the ability to migrate subscription objects avoids the need to
set them up manually, which would require truncating all the
subscription tables and setting the logical replication slots.

~

TBH, I am wondering what is the purpose of this sentence. It seems
more like a justification for the patch, but does the user need to
know all this?

~~~

5.
+      <para>
+       All the subscription tables in the old subscriber should be in
+       <literal>i</literal> (initialize), <literal>r</literal> (ready) or
+       <literal>s</literal> (synchronized). This can be verified by checking
+       <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>

/should be in/should be in state/
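
FWIW, the prerequisite could be verified with something like the below
(an untested sketch -- the patch's own check query is more thorough):

SELECT s.subname, r.srrelid::regclass AS relname, r.srsubstate
FROM pg_subscription_rel r
JOIN pg_subscription s ON s.oid = r.srsubid
WHERE r.srsubstate NOT IN ('i', 's', 'r');

Any rows returned would indicate tables not yet in a migratable state.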

~~~

6.
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>

missing words?

/This can be checking/This can be found by checking/
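
FWIW, a quick way to spot subscriptions whose origin entry is missing
could be something like this (untested sketch, relying on the internal
'pg_' || oid origin naming convention):

SELECT s.subname
FROM pg_subscription s
LEFT JOIN pg_replication_origin o ON o.roname = 'pg_' || s.oid
WHERE o.roident IS NULL;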

~~~

7.
+    <para>
+     The subscriptions will be migrated to new cluster in disabled state, they
+     can be enabled after upgrade by following the steps:
+    </para>

The first bullet also says "Enable the subscription..." so I think
this paragraph should be worded like the below.

SUGGESTION
The subscriptions will be migrated to the new cluster in a disabled
state. After migration, do this:

======
src/backend/catalog/pg_subscription.c

8.
 #include "nodes/makefuncs.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "storage/lmgr.h"

Why does this change need to be in the patch when there are no other
code changes in this file?

======
src/backend/utils/adt/pg_upgrade_support.c

9. binary_upgrade_create_sub_rel_state

IMO a better name for this function would be
'binary_upgrade_add_sub_rel_state' (because it delegates to
AddSubscriptionRelState).

Then it would obey the same name pattern as the other function
'binary_upgrade_replorigin_advance' (which delegates to
replorigin_advance).

~~~

10.
+/*
+ * binary_upgrade_create_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * table.
+ */
+Datum
+binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS)
+{
+ Relation rel;
+ HeapTuple tup;
+ Oid subid;
+ Form_pg_subscription form;
+ char    *subname;
+ Oid relid;
+ char relstate;
+ XLogRecPtr sublsn;

10a.
/to pg_subscription_rel table./to pg_subscription_rel catalog./

~

10b.
Maybe it would be helpful if the function arguments were documented
up-front in the function-comment, or in the variable declarations.

SUGGESTION
char *subname; /* ARG0 = subscription name */
Oid relid; /* ARG1 = relation Oid */
char relstate; /* ARG2 = subrel state */
XLogRecPtr sublsn; /* ARG3 (optional) = subscription lsn */

~~~

11.
if (PG_ARGISNULL(3))
sublsn = InvalidXLogRecPtr;
else
sublsn = PG_GETARG_LSN(3);
FWIW, I'd write that as a one-line ternary assignment allowing all the
args to be grouped nicely together.

SUGGESTION
sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);

~~~

12. binary_upgrade_replorigin_advance

/*
* binary_upgrade_replorigin_advance
*
* Update the remote_lsn for the subscriber's replication origin.
*/
Datum
binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
{
Relation rel;
HeapTuple tup;
Oid subid;
Form_pg_subscription form;
char *subname;
XLogRecPtr sublsn;
char originname[NAMEDATALEN];
RepOriginId originid;
~

Similar to previous comment #10b. Maybe it would be helpful if the
function arguments were documented up-front in the function-comment, or
in the variable declarations.

SUGGESTION
char originname[NAMEDATALEN];
RepOriginId originid;
char *subname; /* ARG0 = subscription name */
XLogRecPtr sublsn; /* ARG1 = subscription lsn */

~~~

13.
+ subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+ if (PG_ARGISNULL(1))
+ sublsn = InvalidXLogRecPtr;
+ else
+ sublsn = PG_GETARG_LSN(1);

Similar to previous comment #11. FWIW, I'd write that as a one-line
ternary assignment allowing all the args to be grouped nicely
together.

SUGGESTION
subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
sublsn = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);

======
src/bin/pg_dump/pg_dump.c

14. getSubscriptionTables

+/*
+ * getSubscriptionTables
+ *   get information about subscription membership for dumpable tables, this
+ *    will be used only in binary-upgrade mode.
+ */

Should use multiple sentences.

SUGGESTION
Get information about subscription membership for dumpable tables.
This will be used only in binary-upgrade mode.

~~~

15.
+ /* Get subscription relation fields */
+ i_srsubid = PQfnumber(res, "srsubid");
+ i_srrelid = PQfnumber(res, "srrelid");
+ i_srsubstate = PQfnumber(res, "srsubstate");
+ i_srsublsn = PQfnumber(res, "srsublsn");

Might it be better to say "Get pg_subscription_rel attributes"?

~~~

16. getSubscriptions

+ appendPQExpBufferStr(query, "o.remote_lsn\n");
  appendPQExpBufferStr(query,
  "FROM pg_subscription s\n"
+ "LEFT JOIN pg_replication_origin_status o \n"
+ "    ON o.external_id = 'pg_' || s.oid::text \n"
  "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
  "                   WHERE datname = current_database())");

~

16a.
Should that "remote_lsn" have an alias like "suboriginremotelsn" so
that it matches the later field assignment better?

~

16b.
Probably these catalogs should be qualified using "pg_catalog.".

~~~

17. dumpSubscriptionTable

+/*
+ * dumpSubscriptionTable
+ *   dump the definition of the given subscription table mapping, this will be
+ *    used only for upgrade operation.
+ */

Make this comment consistent with the other one for getSubscriptionTables:
- split into multiple sentences
- use the same terminology "binary-upgrade mode" versus "upgrade operation'.

~~~

18.
+ /*
+ * binary_upgrade_create_sub_rel_state will add the subscription
+ * relation to pg_subscripion_rel table, this is supported only for
+ * upgrade operation.
+ */

Split into multiple sentences.

======
src/bin/pg_dump/pg_dump_sort.c

19.
+ case DO_SUBSCRIPTION_REL:
+ snprintf(buf, bufsize,
+ "SUBSCRIPTION TABLE (ID %d)",
+ obj->dumpId);
+ return;

Should it include the OID (like for DO PUBLICATION_TABLE)?

======
src/bin/pg_upgrade/check.c

20.
check_for_reg_data_type_usage(&old_cluster);
check_for_isn_and_int8_passing_mismatch(&old_cluster);

+ check_for_subscription_state(&old_cluster);
+

There seems no reason anymore for this check to be separated from all
the other checks. Just remove the blank line.

~~~

21. check_for_subscription_state

+/*
+ * check_for_subscription_state()
+ *
+ * Verify that each of the subscriptions have all their corresponding tables in
+ * ready state.
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)

/have/has/

This comment only refers to 'ready' state, but perhaps it is
misleading (or not entirely correct) because later the SQL is testing
for more than just the READY state:

+ "WHERE srsubstate NOT IN ('i', 's', 'r') "

~~~

22.
+ res = executeQueryOrDie(conn,
+ "SELECT s.subname, c.relname, n.nspname "
+ "FROM pg_catalog.pg_subscription_rel r "
+ "LEFT JOIN pg_catalog.pg_subscription s"
+ " ON r.srsubid = s.oid "
+ "LEFT JOIN pg_catalog.pg_class c"
+ " ON r.srrelid = c.oid "
+ "LEFT JOIN pg_catalog.pg_namespace n"
+ " ON c.relnamespace = n.oid "
+ "WHERE srsubstate NOT IN ('i', 's', 'r') "
+ "ORDER BY s.subname");

If you are going to check 'i', 's', and 'r' then I thought this
statement should maybe have some comment about why those states.

~~~

23.
+ pg_fatal("Your installation contains subscription(s) with\n"
+ "Subscription not having origin and/or subscription relation(s) not
in ready state.\n"
+ "A list of subscription not having origin and/or\n"
+ "subscription relation(s) not in ready state is in the file: %s",
+ output_path);

23a.
This message seems to just be saying the same thing 2 times.

It also should use newlines and spaces more like the other similar
pg_fatal calls in this file (e.g. the %s is on the next line etc.).

SUGGESTION
Your installation contains subscriptions without origin or having
relations not in a ready state.\n
A list of the problem subscriptions is in the file:\n
%s

~

23b.
Same question about 'not in ready state'. Is that entirely correct?

======
src/bin/pg_upgrade/t/004_subscription.pl

24.
+sub insert_line
+{
+ my $payload = shift;
+
+ foreach ("t1", "t2")
+ {
+ $publisher->safe_psql('postgres',
+ "INSERT INTO " . $_ . " (val) VALUES('$payload')");
+ }
+}

For clarity, maybe call this function 'insert_line_at_pub'

~~~

25.
+# ------------------------------------------------------
+# Check that pg_upgrade is succesful when all tables are in ready state.
+# ------------------------------------------------------

/succesful/successful/

~~~

26.
+command_ok(
+ [
+ 'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+ '-D',         $new_sub->data_dir, '-b', $bindir,
+ '-B',         $bindir,            '-s', $new_sub->host,
+ '-p',         $old_sub->port,     '-P', $new_sub->port,
+ $mode,        '--check',
+ ],
+ 'run of pg_upgrade --check for old instance with invalid remote_lsn');

This is the command for the "success" case. Why is the message part
referring to "invalid remote_lsn"?

~~~

27.
+$publisher->safe_psql('postgres',
+ "CREATE TABLE tab_primary_key(id serial, val text);");
+$old_sub->safe_psql('postgres',
+ "CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$publisher->safe_psql('postgres',

Maybe it is not necessary, but won't it be better if the publisher
table also has a primary key (so DDL matches its table name)?

~~~

28.
+# Add a row in subscriber so that the table sync will fail.
+$old_sub->safe_psql('postgres',
+ "INSERT INTO tab_primary_key values(1, 'before initial sync')");

The comment should be slightly more descriptive by saying the reason
it will fail is that you deliberately inserted the same PK value
again.

~~~

29.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die "Timed out while waiting for subscriber to synchronize data";

Since this cannot synchronize the table data, maybe the message should
be more like "Timed out while waiting for the table state to become
'd' (datasync)"

~~~

30.
+command_fails(
+ [
+ 'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+ '-D',         $new_sub->data_dir, '-b', $bindir,
+ '-B',         $bindir,            '-s', $new_sub->host,
+ '-p',         $old_sub->port,     '-P', $new_sub->port,
+ $mode,        '--check',
+ ],
+ 'run of pg_upgrade --check for old instance with incorrect sub rel');

/with incorrect sub rel/with incorrect sub rel state/ (??)

~~~

31.
+# ------------------------------------------------------
+# Check that pg_upgrade doesn't detect any problem once all the subscription's
+# relation are in 'r' (ready) state.
+# ------------------------------------------------------

31a.
/relation/relations/

~

31b.
Do you think that comment is correct? All you are doing here is
allowing the old_sub to proceed because there is no longer any
conflict -- but isn't that just normal pub/sub behaviour that has
nothing to do with pg_upgrade?

~~~

32.
+# Stop the old subscriber, insert a row in each table while it's down and add
+# t2 to the publication

/in each table/in each publisher table/

Also, it is not each table -- it's only t1 and t2; not tab_primary_key.

~~~

33.
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2), "There should be 2 rows in pg_subscription_rel");

/2 rows in pg_subscription_rel/2 rows in pg_subscription_rel
(representing t1 and tab_primary_key)/

======

34. binary_upgrade_create_sub_rel_state

+{ oid => '8404', descr => 'for use by pg_upgrade (relation for
pg_subscription_rel)',
+  proname => 'binary_upgrade_create_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_create_sub_rel_state' },

As mentioned in a previous review comment #9, I felt this function
should have a different name: binary_upgrade_add_sub_rel_state.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#108 vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#107)
1 attachment(s)
Re: pg_upgrade and logical replication

On Mon, 6 Nov 2023 at 07:51, Peter Smith <smithpb2250@gmail.com> wrote:

Here are some review comments for patch v11-0001

======
Commit message

1.
The subscription's replication origin are needed to ensure
that we don't replicate anything twice.

~

/are needed/is needed/

Modified

2.
Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: /messages/by-id/20230217075433.u5mjly4d5cr4hcfe@jrouhaud

~

Include Vignesh as another author.

Modified

======
doc/src/sgml/ref/pgupgrade.sgml

3.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies which includes the subscription tables information present in
+     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system table and the subscription replication origin which
+     will help in continuing logical replication from where the old subscriber
+     was replicating. This helps in avoiding the need for setting up the

I became a bit lost reading this paragraph due to the multiple 'which'...

SUGGESTION
pg_upgrade attempts to migrate subscription dependencies which
includes the subscription table information present in
pg_subscription_rel system
catalog and also the subscription replication origin. This allows
logical replication on the new subscriber to continue from where the
old subscriber was up to.

Modified

~~~

4.
+     was replicating. This helps in avoiding the need for setting up the
+     subscription objects manually which requires truncating all the
+     subscription tables and setting the logical replication slots. Migration

SUGGESTION
Having the ability to migrate subscription objects avoids the need to
set them up manually, which would require truncating all the
subscription tables and setting the logical replication slots.

I have removed this

~

TBH, I am wondering what is the purpose of this sentence. It seems
more like a justification for the patch, but does the user need to
know all this?

~~~

5.
+      <para>
+       All the subscription tables in the old subscriber should be in
+       <literal>i</literal> (initialize), <literal>r</literal> (ready) or
+       <literal>s</literal> (synchronized). This can be verified by checking
+       <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>

/should be in/should be in state/

Modified

~~~

6.
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>

missing words?

/This can be checking/This can be found by checking/

Modified

~~~

7.
+    <para>
+     The subscriptions will be migrated to new cluster in disabled state, they
+     can be enabled after upgrade by following the steps:
+    </para>

The first bullet also says "Enable the subscription..." so I think
this paragraph should be worded like the below.

SUGGESTION
The subscriptions will be migrated to the new cluster in a disabled
state. After migration, do this:

Modified

======
src/backend/catalog/pg_subscription.c

8.
#include "nodes/makefuncs.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
#include "storage/lmgr.h"

Why does this change need to be in the patch when there are no other
code changes in this file?

Modified

======
src/backend/utils/adt/pg_upgrade_support.c

9. binary_upgrade_create_sub_rel_state

IMO a better name for this function would be
'binary_upgrade_add_sub_rel_state' (because it delegates to
AddSubscriptionRelState).

Then it would obey the same name pattern as the other function
'binary_upgrade_replorigin_advance' (which delegates to
replorigin_advance).

Modified

~~~

10.
+/*
+ * binary_upgrade_create_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * table.
+ */
+Datum
+binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS)
+{
+ Relation rel;
+ HeapTuple tup;
+ Oid subid;
+ Form_pg_subscription form;
+ char    *subname;
+ Oid relid;
+ char relstate;
+ XLogRecPtr sublsn;

10a.
/to pg_subscription_rel table./to pg_subscription_rel catalog./

Modified

~

10b.
Maybe it would be helpful if the function arguments were documented
up-front in the function-comment, or in the variable declarations.

SUGGESTION
char *subname; /* ARG0 = subscription name */
Oid relid; /* ARG1 = relation Oid */
char relstate; /* ARG2 = subrel state */
XLogRecPtr sublsn; /* ARG3 (optional) = subscription lsn */

I felt the variables are self-explanatory in this case and also
consistent with other functions.

~~~

11.
if (PG_ARGISNULL(3))
sublsn = InvalidXLogRecPtr;
else
sublsn = PG_GETARG_LSN(3);
FWIW, I'd write that as a one-line ternary assignment allowing all the
args to be grouped nicely together.

SUGGESTION
sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);

Modified

~~~

12. binary_upgrade_replorigin_advance

/*
* binary_upgrade_replorigin_advance
*
* Update the remote_lsn for the subscriber's replication origin.
*/
Datum
binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
{
Relation rel;
HeapTuple tup;
Oid subid;
Form_pg_subscription form;
char *subname;
XLogRecPtr sublsn;
char originname[NAMEDATALEN];
RepOriginId originid;
~

Similar to previous comment #10b. Maybe it would be helpful if the
function arguments were documented up-front in the function-comment, or
in the variable declarations.

SUGGESTION
char originname[NAMEDATALEN];
RepOriginId originid;
char *subname; /* ARG0 = subscription name */
XLogRecPtr sublsn; /* ARG1 = subscription lsn */

I felt the variables are self-explanatory in this case and also
consistent with other functions.

~~~

13.
+ subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+ if (PG_ARGISNULL(1))
+ sublsn = InvalidXLogRecPtr;
+ else
+ sublsn = PG_GETARG_LSN(1);

Similar to previous comment #11. FWIW, I'd write that as a one-line
ternary assignment allowing all the args to be grouped nicely
together.

SUGGESTION
subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
sublsn = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);

Modified

======
src/bin/pg_dump/pg_dump.c

14. getSubscriptionTables

+/*
+ * getSubscriptionTables
+ *   get information about subscription membership for dumpable tables, this
+ *    will be used only in binary-upgrade mode.
+ */

Should use multiple sentences.

SUGGESTION
Get information about subscription membership for dumpable tables.
This will be used only in binary-upgrade mode.

Modified

~~~

15.
+ /* Get subscription relation fields */
+ i_srsubid = PQfnumber(res, "srsubid");
+ i_srrelid = PQfnumber(res, "srrelid");
+ i_srsubstate = PQfnumber(res, "srsubstate");
+ i_srsublsn = PQfnumber(res, "srsublsn");

Might it be better to say "Get pg_subscription_rel attributes"?

Modified

~~~

16. getSubscriptions

+ appendPQExpBufferStr(query, "o.remote_lsn\n");
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
+ "LEFT JOIN pg_replication_origin_status o \n"
+ "    ON o.external_id = 'pg_' || s.oid::text \n"
"WHERE s.subdbid = (SELECT oid FROM pg_database\n"
"                   WHERE datname = current_database())");

~

16a.
Should that "remote_lsn" have an alias like "suboriginremotelsn" so
that it matches the later field assignment better?

Modified

~

16b.
Probably these catalogs should be qualified using "pg_catalog.".

Modified

~~~

17. dumpSubscriptionTable

+/*
+ * dumpSubscriptionTable
+ *   dump the definition of the given subscription table mapping, this will be
+ *    used only for upgrade operation.
+ */

Make this comment consistent with the other one for getSubscriptionTables:
- split into multiple sentences
- use the same terminology "binary-upgrade mode" versus "upgrade operation'.

Modified

~~~

18.
+ /*
+ * binary_upgrade_create_sub_rel_state will add the subscription
+ * relation to pg_subscripion_rel table, this is supported only for
+ * upgrade operation.
+ */

Split into multiple sentences.

Modified

======
src/bin/pg_dump/pg_dump_sort.c

19.
+ case DO_SUBSCRIPTION_REL:
+ snprintf(buf, bufsize,
+ "SUBSCRIPTION TABLE (ID %d)",
+ obj->dumpId);
+ return;

Should it include the OID (like for DO PUBLICATION_TABLE)?

Modified

======
src/bin/pg_upgrade/check.c

20.
check_for_reg_data_type_usage(&old_cluster);
check_for_isn_and_int8_passing_mismatch(&old_cluster);

+ check_for_subscription_state(&old_cluster);
+

There seems no reason anymore for this check to be separated from all
the other checks. Just remove the blank line.

Modified

~~~

21. check_for_subscription_state

+/*
+ * check_for_subscription_state()
+ *
+ * Verify that each of the subscriptions have all their corresponding tables in
+ * ready state.
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)

/have/has/

This comment only refers to 'ready' state, but perhaps it is
misleading (or not entirely correct) because later the SQL is testing
for more than just the READY state:

+ "WHERE srsubstate NOT IN ('i', 's', 'r') "

Modified

~~~

22.
+ res = executeQueryOrDie(conn,
+ "SELECT s.subname, c.relname, n.nspname "
+ "FROM pg_catalog.pg_subscription_rel r "
+ "LEFT JOIN pg_catalog.pg_subscription s"
+ " ON r.srsubid = s.oid "
+ "LEFT JOIN pg_catalog.pg_class c"
+ " ON r.srrelid = c.oid "
+ "LEFT JOIN pg_catalog.pg_namespace n"
+ " ON c.relnamespace = n.oid "
+ "WHERE srsubstate NOT IN ('i', 's', 'r') "
+ "ORDER BY s.subname");

If you are going to check 'i', 's', and 'r' then I thought this
statement should maybe have some comment about why those states.

Modified

~~~

23.
+ pg_fatal("Your installation contains subscription(s) with\n"
+ "Subscription not having origin and/or subscription relation(s) not
in ready state.\n"
+ "A list of subscription not having origin and/or\n"
+ "subscription relation(s) not in ready state is in the file: %s",
+ output_path);

23a.
This message seems to just be saying the same thing 2 times.

It also should use newlines and spaces more like the other similar
pg_fatal calls in this file (e.g. the %s is on the next line etc.).

SUGGESTION
Your installation contains subscriptions without origin or having
relations not in a ready state.\n
A list of the problem subscriptions is in the file:\n
%s

Modified

~

23b.
Same question about 'not in ready state'. Is that entirely correct?

Modified

======
src/bin/pg_upgrade/t/004_subscription.pl

24.
+sub insert_line
+{
+ my $payload = shift;
+
+ foreach ("t1", "t2")
+ {
+ $publisher->safe_psql('postgres',
+ "INSERT INTO " . $_ . " (val) VALUES('$payload')");
+ }
+}

For clarity, maybe call this function 'insert_line_at_pub'

Modified

~~~

25.
+# ------------------------------------------------------
+# Check that pg_upgrade is succesful when all tables are in ready state.
+# ------------------------------------------------------

/succesful/successful/

Modified

~~~

26.
+command_ok(
+ [
+ 'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+ '-D',         $new_sub->data_dir, '-b', $bindir,
+ '-B',         $bindir,            '-s', $new_sub->host,
+ '-p',         $old_sub->port,     '-P', $new_sub->port,
+ $mode,        '--check',
+ ],
+ 'run of pg_upgrade --check for old instance with invalid remote_lsn');

This is the command for the "success" case. Why is the message part
referring to "invalid remote_lsn"?

Modified

~~~

27.
+$publisher->safe_psql('postgres',
+ "CREATE TABLE tab_primary_key(id serial, val text);");
+$old_sub->safe_psql('postgres',
+ "CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$publisher->safe_psql('postgres',

Maybe it is not necessary, but won't it be better if the publisher
table also has a primary key (so DDL matches its table name)?

Modified

~~~

28.
+# Add a row in subscriber so that the table sync will fail.
+$old_sub->safe_psql('postgres',
+ "INSERT INTO tab_primary_key values(1, 'before initial sync')");

The comment should be slightly more descriptive by saying the reason
it will fail is that you deliberately inserted the same PK value
again.

Modified

~~~

29.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die "Timed out while waiting for subscriber to synchronize data";

Since this cannot synchronize the table data, maybe the message should
be more like "Timed out while waiting for the table state to become
'd' (datasync)"

Modified

~~~

30.
+command_fails(
+ [
+ 'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+ '-D',         $new_sub->data_dir, '-b', $bindir,
+ '-B',         $bindir,            '-s', $new_sub->host,
+ '-p',         $old_sub->port,     '-P', $new_sub->port,
+ $mode,        '--check',
+ ],
+ 'run of pg_upgrade --check for old instance with incorrect sub rel');

/with incorrect sub rel/with incorrect sub rel state/ (??)

Modified

~~~

31.
+# ------------------------------------------------------
+# Check that pg_upgrade doesn't detect any problem once all the subscription's
+# relation are in 'r' (ready) state.
+# ------------------------------------------------------

31a.
/relation/relations/

I have removed this comment

31b.
Do you think that comment is correct? All you are doing here is
allowing the old_sub to proceed because there is no longer any
conflict -- but isn't that just normal pub/sub behaviour that has
nothing to do with pg_upgrade?

I have removed this comment

~~~

32.
+# Stop the old subscriber, insert a row in each table while it's down and add
+# t2 to the publication

/in each table/in each publisher table/

Also, it is not each table -- it's only t1 and t2; not tab_primary_key.

Modified

~~~

33.
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2), "There should be 2 rows in pg_subscription_rel");

/2 rows in pg_subscription_rel/2 rows in pg_subscription_rel
(representing t1 and tab_primary_key)/

Modified

======

34. binary_upgrade_create_sub_rel_state

+{ oid => '8404', descr => 'for use by pg_upgrade (relation for
pg_subscription_rel)',
+  proname => 'binary_upgrade_create_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_create_sub_rel_state' },

As mentioned in a previous review comment #9, I felt this function
should have a different name: binary_upgrade_add_sub_rel_state.

Modified

Thanks for the comments; the attached v12 version patch has the
changes for the same.

Regards,
Vignesh

Attachments:

v12-0001-Preserve-the-full-subscription-s-state-during-pg.patch (text/x-patch; charset=US-ASCII)
From 50f1a2b3bac25d5bd709bacfc7e71c15e708776f Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v12] Preserve the full subscription's state during pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_create_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_create_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_create_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, the state is the state of the relation, sublsn is subscription lsn
which is optional, and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values (subname, relid, state and sublsn) from the
old cluster.
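
As an illustration only (the function is callable just while the server
runs in binary upgrade mode, and the names used here are made up):

SELECT binary_upgrade_create_sub_rel_state('mysub', 'public.tab1'::regclass, 'r', '0/16B6E60');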

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To preserve it, this patch also teaches pg_dump to update the replication
origin along with CREATE SUBSCRIPTION by using the
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is subscription lsn.
pg_dump will retrieve these values (subname and sublsn) from the old cluster.
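
As an illustration only (again restricted to binary upgrade mode, and
the names used here are made up):

SELECT binary_upgrade_replorigin_advance('mysub', '0/16B6E60');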

pg_upgrade will check that all the subscription relations are in 'i' (init),
's' (data sync) or 'r' (ready) state, and will error out if that's not the
case, logging the reason for the failure.

Author: Julien Rouhaud, Vignesh C
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  64 +++++++
 src/backend/utils/adt/pg_upgrade_support.c | 118 ++++++++++++
 src/bin/pg_dump/common.c                   |  22 +++
 src/bin/pg_dump/pg_dump.c                  | 201 ++++++++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  16 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 123 ++++++++++++
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 213 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/tools/pgindent/typedefs.list           |   1 +
 11 files changed, 774 insertions(+), 6 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 46e8a0b746..1a6ad16060 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,70 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configuration</link> on the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies, which include the subscription table information present in
+     the <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog as well as the subscription's replication origin. This
+     allows logical replication on the new subscriber to continue from where
+     the old subscriber left off. Migration of subscription dependencies is
+     only supported when the old cluster is version 17.0 or later. Subscription
+     dependencies on clusters before version 17.0 will silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met, an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize), <literal>r</literal> (ready) or
+       <literal>s</literal> (synchronized). This can be verified by checking
+       <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+    </itemizedlist>
+
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, perform the following steps:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Create, on the new subscriber, any tables that were added to the
+       publication during the upgrade, and refresh the publication by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..4a3da80e49 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,21 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +311,115 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to the
+ * pg_subscription_rel catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	if (!OidIsValid(relid))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("invalid relation identifier used: %u", relid));
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	sublsn;
+	char		originname[NAMEDATALEN];
+	RepOriginId originid;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	sublsn = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+	originid = replorigin_by_name(originname, false);
+	replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e863913849..4e33493852 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4581,6 +4582,99 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i;
+	int			cur_rel = 0;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[cur_rel].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[cur_rel].dobj.catId.tableoid = relid;
+		subrinfo[cur_rel].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[cur_rel].dobj);
+		subrinfo[cur_rel].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[cur_rel].tblinfo = tblinfo;
+		subrinfo[cur_rel].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[cur_rel].srsublsn = NULL;
+		else
+			subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[cur_rel].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[cur_rel].dobj), fout);
+
+		cur_rel++;
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4607,6 +4701,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4662,17 +4757,20 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	appendPQExpBufferStr(query, "o.remote_lsn AS suboriginremotelsn\n");
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
-						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
+						 "FROM pg_catalog.pg_subscription s\n"
+						 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
+						 "WHERE s.subdbid = (SELECT oid FROM pg_catalog.pg_database\n"
 						 "                   WHERE datname = current_database())");
 
 	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
@@ -4698,6 +4796,7 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4735,6 +4834,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4744,6 +4848,80 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription
+		 * relation to the pg_subscription_rel catalog. This will be used only in
+		 * binary-upgrade mode.
+		 */
+		if (fout->remoteVersion >= 170000)
+		{
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+			appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+			appendPQExpBuffer(query,
+							  ", %u, '%c'",
+							  subrinfo->tblinfo->dobj.catId.oid,
+							  subrinfo->srsubstate);
+
+			if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", '%s'",
+								  subrinfo->srsublsn);
+			else
+				appendPQExpBuffer(query, ", NULL");
+
+			appendPQExpBufferStr(query, ");\n");
+		}
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4824,6 +5002,17 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000 &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10442,6 +10631,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18508,6 +18700,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..62b3d9249b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -671,8 +672,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +711,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +771,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..70d3087e9f 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -20,6 +20,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_state(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -111,6 +112,7 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_composite_data_type_usage(&old_cluster);
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
+	check_for_subscription_state(&old_cluster);
 
 	/*
 	 * Logical replication slots can be migrated since PG17. See comments atop
@@ -812,6 +814,127 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_state()
+ *
+ * Verify that each of the subscriptions has all their corresponding tables in
+ * i (initialize), r (ready) or s (synchronized) state.
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	/* Subscription relations state can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subscription_state.txt");
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "replication origin is missing for database:%s subscription:%s\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * The subscription relation should be in the i (initialize),
+		 * r (ready) or s (synchronized) state, as then either the replication
+		 * slot has not been created yet or has already been dropped, and the
+		 * required WAL files will be present on the publisher. The other
+		 * states are not OK, as the worker depends on the replication
+		 * slot/origin in these cases:
+		 * a) SUBREL_STATE_DATASYNC: The table sync worker will try to drop
+		 * the replication slot, but because the slot was created with the old
+		 * subscription id on the publisher, the upgraded subscriber will not
+		 * be able to clean it up.
+		 * b) SUBREL_STATE_FINISHEDCOPY: The tablesync worker expects the
+		 * origin to already exist, but because the origin was created with
+		 * the old subscription id, the worker will not be able to find it.
+		 * c) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow them.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, c.relname, n.nspname, r.srsubstate "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r', 's') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "database:%s subscription:%s schema:%s relation:%s state:%s not in required state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in the i (initialize), r (ready) or s (synchronized) state.\n"
+				 "A list of problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..ee6029b9b7
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,213 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Path qw(rmtree);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $bindir = $new_sub->config_data('--bindir');
+
+sub insert_line_at_pub
+{
+	my $payload = shift;
+
+	foreach ("t1", "t2")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("t1", "t2")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line_at_pub('before initial sync');
+
+# Setup logical replication, replicating only 1 table
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub FOR TABLE t1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION regress_pub"
+);
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready state.
+# ------------------------------------------------------
+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r');";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance when the subscription tables are in ready state');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Check the number of rows for each table on each server
+my $result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on publisher");
+$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(1), "check initial t2 table data on publisher");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on the old subscriber");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0), "check initial t2 table data on the old subscriber");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if there's a subscription with tables in
+# a state different than 'r' (ready), 'i' (init) and 's' (synchronized).
+# ------------------------------------------------------
+
+$publisher->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$old_sub->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
+$old_sub->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_primary_key");
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die "Timed out while waiting for the table state to become 'd' (datasync)";
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in invalid \'d\' (datasync) state');
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Delete the table data so that the primary key violation error will not happen
+# and tab_primary_key reaches ready state.
+$old_sub->safe_psql('postgres', "DELETE FROM tab_primary_key");
+
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# ------------------------------------------------------
+# The incremental changes added to the publisher are replicated after upgrade.
+# ------------------------------------------------------
+
+# Stop the old subscriber, insert a row in t1 and t2 publisher table while it's
+# down and add t2 to the publication.
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+insert_line_at_pub('while old_sub is down');
+
+# Run pg_upgrade
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+	],
+	'run of pg_upgrade for new sub');
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after pg_upgrade success");
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE t2");
+
+$new_sub->start;
+
+# Subscription relations and replication origin remote_lsn should be preserved
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2), "There should be 2 rows in pg_subscription_rel (representing t1 and tab_primary_key)");
+
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s where os.external_id = 'pg_' || s.oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# There should be no new replicated rows before enabling the subscription
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1),
+	"t1 table has no new replicated rows before enabling the subscription");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in t2 table which is not part of the publication");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub ENABLE");
+
+$publisher->wait_for_catchup('regress_sub');
+
+# Rows on t1 should have been replicated, while nothing should happen for t2
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2), "check replicated inserts on new subscriber");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(0),
+	"no change in table t2, which is not part of the publication, after enabling the subscription"
+);
+
+# Refresh the subscription, only the missing row on t2 should be replicated
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(2),
+	"check there is no change when no changes were replicated");
+$result = $new_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index f14aed422a..c7bf3cbd55 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11382,6 +11382,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 87c1aee379..90b321945c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2656,6 +2656,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#109vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#104)
Re: pg_upgrade and logical replication

On Thu, 2 Nov 2023 at 17:01, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Nov 2, 2023 at 3:41 PM vignesh C <vignesh21@gmail.com> wrote:

I have slightly modified it now and also made it consistent with the
replication slot upgrade, but I was not sure if we need to add
anything more. Let me know if anything else needs to be added. I will
add it.

I think it is important for users to know how to upgrade their
multi-node setup. Say a two-node setup where replication is working
both ways (aka each node has both publications and subscriptions), and
similarly how to upgrade if there are multiple nodes involved.

I was thinking of documenting something like this:
Steps to upgrade logical replication clusters:
Warning:
Upgrading logical replication nodes requires multiple steps to be
performed. Because not all operations are transactional, the user is
advised to take backups.
Backups can be taken as described in
https://www.postgresql.org/docs/current/backup.html

Upgrading 2 node logical replication cluster:
1) Let's say publisher is in Node1 and subscriber is in Node2.
2) Stop the publisher server in Node1.
3) Disable the subscriptions in Node2.
4) Upgrade the publisher node Node1 to Node1_new.
5) Start the publisher node Node1_new.
6) Stop the subscriber server in Node2.
7) Upgrade the subscriber node Node2 to Node2_new.
8) Start the subscriber node Node2_new.
9) Alter the subscription connections in Node2_new to point from Node1
to Node1_new.
10) Enable the subscriptions in Node2_new.
11) Create any tables that were created in Node1_new between step-5
and now and Refresh the publications.

Steps to upgrade cascaded logical replication clusters:
1) Let's say we have a cascaded logical replication setup
Node1->Node2->Node3. Here Node2 is subscribing to Node1 and Node3 is
subscribing to Node2.
2) Stop the server in Node1.
3) Disable the subscriptions in Node2 and Node3.
4) Upgrade the publisher node Node1 to Node1_new.
5) Start the publisher node Node1_new.
6) Stop the server in Node2.
7) Upgrade the subscriber node Node2 to Node2_new.
8) Start the subscriber node Node2_new.
9) Alter the subscription connections in Node2_new to point from Node1
to Node1_new.
10) Enable the subscriptions in Node2_new.
11) Create any tables that were created in Node1_new between step-5
and now and Refresh the publications.
12) Stop the server in Node3.
13) Upgrade the subscriber node Node3 to Node3_new.
14) Start the subscriber node Node3_new.
15) Alter the subscription connections in Node3_new to point from
Node2 to Node2_new.
16) Enable the subscriptions in Node3_new.
17) Create any tables that were created in Node2_new between step-8
and now and Refresh the publications.

Upgrading 2 node circular logical replication cluster:
1) Let's say we have a circular logical replication setup Node1->Node2
& Node2->Node1. Here Node2 is subscribing to Node1 and Node1 is
subscribing to Node2.
2) Stop the server in Node1.
3) Disable the subscriptions in Node2.
4) Upgrade the node Node1 to Node1_new.
5) Start the node Node1_new.
6) Enable the subscriptions in Node1_new.
7) Wait till all the incremental changes are synchronized.
8) Alter the subscription connections in Node2 to point from Node1 to Node1_new.
9) Create any tables that were created in Node2 between step-2 and now
and Refresh the publications.
10) Stop the server in Node2.
11) Disable the subscriptions in Node1_new.
12) Upgrade the node Node2 to Node2_new.
13) Start the subscriber node Node2_new.
14) Enable the subscriptions in Node2_new.
15) Alter the subscription connections in Node1_new to point from Node2 to
Node2_new.
16) Create any tables that were created in Node1_new between step-10
and now and Refresh the publications.

I have done basic testing with this, I will do further testing and
update it if I find any issues.
Let me know if this idea is ok or we need something different.

Regards,
Vignesh

#110Michael Paquier
michael@paquier.xyz
In reply to: vignesh C (#109)
Re: pg_upgrade and logical replication

On Wed, Nov 08, 2023 at 10:52:29PM +0530, vignesh C wrote:

Upgrading logical replication nodes requires multiple steps to be
performed. Because not all operations are transactional, the user is
advised to take backups.
Backups can be taken as described in
https://www.postgresql.org/docs/current/backup.html

There's a similar risk with --link if the upgrade fails after the new
cluster was started and the files linked began getting modified, so
that's something users would be OK with, I guess.

Upgrading 2 node logical replication cluster:
1) Let's say publisher is in Node1 and subscriber is in Node2.
2) Stop the publisher server in Node1.
3) Disable the subscriptions in Node2.
4) Upgrade the publisher node Node1 to Node1_new.
5) Start the publisher node Node1_new.
6) Stop the subscriber server in Node2.
7) Upgrade the subscriber node Node2 to Node2_new.
8) Start the subscriber node Node2_new.
9) Alter the subscription connections in Node2_new to point from Node1
to Node1_new.

Do they really need to do so in a pg_upgrade flow? The connection
endpoint would be likely the same for transparency, no?

10) Enable the subscriptions in Node2_new.
11) Create any tables that were created in Node1_new between step-5
and now and Refresh the publications.

How about the opposite stance, where an upgrade flow does first the
subscriber and then the publisher? Would this be worth mentioning?
Case 3 touches that as nodes hold both publishers and subscribers.

Steps to upgrade cascaded logical replication clusters:
1) Let's say we have a cascaded logical replication setup
Node1->Node2->Node3. Here Node2 is subscribing to Node1 and Node3 is
subscribing to Node2.
2) Stop the server in Node1.
3) Disable the subscriptions in Node2 and Node3.
4) Upgrade the publisher node Node1 to Node1_new.
5) Start the publisher node Node1_new.
6) Stop the server in Node2.
7) Upgrade the subscriber node Node2 to Node2_new.
8) Start the subscriber node Node2_new.
9) Alter the subscription connections in Node2_new to point from Node1
to Node1_new.

Same here.

10) Enable the subscriptions in Node2_new.
11) Create any tables that were created in Node1_new between step-5
and now and Refresh the publications.
12) Stop the server in Node3.
13) Upgrade the subscriber node Node3 to Node3_new.
14) Start the subscriber node Node3_new.
15) Alter the subscription connections in Node3_new to point from
Node2 to Node2_new.
16) Enable the subscriptions in Node3_new.
17) Create any tables that were created in Node2_new between step-8
and now and Refresh the publications.

Upgrading 2 node circular logical replication cluster:
1) Let's say we have a circular logical replication setup Node1->Node2
& Node2->Node1. Here Node2 is subscribing to Node1 and Node1 is
subscribing to Node2.
2) Stop the server in Node1.
3) Disable the subscriptions in Node2.
4) Upgrade the node Node1 to Node1_new.
5) Start the node Node1_new.
6) Enable the subscriptions in Node1_new.
7) Wait till all the incremental changes are synchronized.
8) Alter the subscription connections in Node2 to point from Node1 to Node1_new.
9) Create any tables that were created in Node2 between step-2 and now
and Refresh the publications.
10) Stop the server in Node2.
11) Disable the subscriptions in Node1.
12) Upgrade the node Node2 to Node2_new.
13) Start the subscriber node Node2_new.
14) Enable the subscriptions in Node2_new.
15) Alter the subscription connections in Node1 to point from Node2 to
Node2_new.
16) Create any tables that were created in Node1_new between step-10
and now and Refresh the publications.

I have done basic testing with this, I will do further testing and
update it if I find any issues.
Let me know if this idea is ok or we need something different.

I have not tested, but having documentation along these lines is good
because it becomes clear what steps one needs to perform.

Another thing that may be worth mentioning is the schema changes that
can happen. We could just say that the schema should be fixed while
running an upgrade, which is kind of fair to expect in logical
setups for the replicated tables anyway?

Do you think that there would be an issue in automating such tests
once support for the upgrade of subscribers is done (hopefully)? The
first scenario may not need extra coverage if we have already
003_logical_slots.pl and a second file to test for the subscriber
part, though.
--
Michael

#111Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#108)
Re: pg_upgrade and logical replication

Thanks for addressing my previous review comments.

I re-checked the latest patch v12-0001 and found the following:

======
Commit message

1.
The new SQL binary_upgrade_create_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_create_sub_rel_state(subname text, relid oid,
state char [,sublsn pg_lsn])

~

Looks like v12 accidentally forgot to update this to the modified
function name 'binary_upgrade_add_sub_rel_state'

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#112Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#109)
Re: pg_upgrade and logical replication

On Wed, Nov 8, 2023 at 10:52 PM vignesh C <vignesh21@gmail.com> wrote:

Upgrading 2 node circular logical replication cluster:
1) Let's say we have a circular logical replication setup Node1->Node2
& Node2->Node1. Here Node2 is subscribing to Node1 and Node1 is
subscribing to Node2.
2) Stop the server in Node1.
3) Disable the subscriptions in Node2.
4) Upgrade the node Node1 to Node1_new.
5) Start the node Node1_new.
6) Enable the subscriptions in Node1_new.
7) Wait till all the incremental changes are synchronized.
8) Alter the subscription connections in Node2 to point from Node1 to Node1_new.
9) Create any tables that were created in Node2 between step-2 and now
and Refresh the publications.

I haven't reviewed all the steps yet but here steps 7 and 9 seem to
require some validation. How can incremental changes be synchronized
till all the new tables are created and synced before step 7?

--
With Regards,
Amit Kapila.

#113Michael Paquier
michael@paquier.xyz
In reply to: Peter Smith (#111)
Re: pg_upgrade and logical replication

On Thu, Nov 09, 2023 at 01:14:05PM +1100, Peter Smith wrote:

Looks like v12 accidentally forgot to update this to the modified
function name 'binary_upgrade_add_sub_rel_state'

This v12 is overall cleaner than its predecessors. Nice to see.

+my $result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on publisher");
+$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(1), "check initial t1 table data on publisher");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on the old subscriber");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t2");

I'd argue that t1 and t2 should have less generic names. t1 is used
to check that the upgrade process works, while t2 is added to the
publication after upgrading the subscriber. Say something like
tab_upgraded or tab_not_upgraded?

+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r');";

Perhaps it would be safer to use a query that checks the number of
relations in 'r' state? This query would return true if
pg_subscription_rel has no tuples.

+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";

Relying on a pkey error to enforce an incorrect state is a good trick.
Nice.

+command_fails(
+    [
+        'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+        '-D',         $new_sub->data_dir, '-b', $bindir,
+        '-B',         $bindir,            '-s', $new_sub->host,
+        '-p',         $old_sub->port,     '-P', $new_sub->port,
+        $mode,        '--check',
+    ],
+    'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state');
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");

Okay by me to not stop the cluster for the --check to shave a few
cycles. It's a bit sad that we don't cross-check the contents of
subscription_state.txt before removing pg_upgrade_output.d. Finding
the file is easy even if the subdir where it is included is not a
constant name. Then it is possible to apply a regexp with the
contents consumed by a slurp_file().

+my $remote_lsn = $old_sub->safe_psql('postgres',
+    "SELECT remote_lsn FROM pg_replication_origin_status");
Perhaps you've not noticed, but this would be 0/0 most of the time.
However the intention is to check after a valid LSN to make sure that
the origin is set, no?

I am wondering whether this should use a bit more data than just one
tuple, say at least two transactions, one of them with a multi-value
INSERT?
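Something along these lines would cover that (a sketch, assuming a table
t1(id int, val text) on the publisher for illustration):

```sql
-- Two separate transactions, so remote_lsn moves past 0/0 before the check.
BEGIN;
INSERT INTO t1 VALUES (2, 'first transaction');
COMMIT;

-- Second transaction uses a multi-value INSERT.
BEGIN;
INSERT INTO t1 VALUES (3, 'second transaction, row 1'),
                      (4, 'second transaction, row 2');
COMMIT;
```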

+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready state.
+# ------------------------------------------------------
This comment is a bit inconsistent with the states that are accepted,
but why not, at least that's predictable.

+ * relation to pg_subscripion_rel table. This will be used only in

Typo: s/pg_subscripion_rel/pg_subscription_rel/.

This needs some word-smithing to explain the reasons why a state is
not needed:

+        /*
+         * The subscription relation should be in either i (initialize),
+         * r (ready) or s (synchronized) state as either the replication slot
+         * is not created or the replication slot is already dropped and the
+         * required WAL files will be present in the publisher. The other
+         * states are not ok as the worker has dependency on the replication
+         * slot/origin in these case:

A slot not created yet refers to the 'i' state, while 'r' and 's'
refer to a slot created previously but already dropped, right?
Shouldn't this comment tell that rather than mixing the assumptions?

+         * a) SUBREL_STATE_DATASYNC: In this case, the table sync worker will
+         * try to drop the replication slot but as the replication slots will
+         * be created with old subscription id in the publisher and the
+         * upgraded subscriber will not be able to clean the slots in this
+         * case.

Proposal: A relation upgraded while in this state would retain a
replication slot, which could not be dropped by the sync worker
spawned after the upgrade because the subscription ID tracked by the
publisher does not match anymore.

Note: actually, this would be OK if we are able to keep the OIDs of
the subscribers consistent across upgrades? I'm OK with not doing anything
about that in this patch, to keep it simpler. Just asking in passing.

+         * b) SUBREL_STATE_FINISHEDCOPY: In this case, the tablesync worker will
+         * expect the origin to be already existing as the origin is created
+         * with an old subscription id, tablesync worker will not be able to
+         * find the origin in this case.

Proposal: A tablesync worker spawned to work on a relation upgraded
while in this state would expect an origin ID with the OID of the
subscription used before the upgrade, causing it to fail.

+ "A list of problem subscriptions is in the file:\n"

Sounds a bit strange, perhaps use an extra "the", as of "the problem
subscriptions"?

Could it be worth mentioning in the docs that one could also DISABLE
the subscriptions before running the upgrade?

+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.

Hmm. No need to mention pg_replication_origin_status?

If I may ask, how did you check that the given relation states were
OK or not OK? Did you hardcode some wait points in tablesync.c up to
where a state is updated in pg_subscription_rel, then shutdown the
cluster before the upgrade to maintain the catalog in this state?
Finally, after the upgrade, you've cross-checked the dependencies on
the slots and origins to see that the spawned sync workers turned
crazy because of the inconsistencies. Right?
--
Michael

#114vignesh C
vignesh21@gmail.com
In reply to: Michael Paquier (#113)
1 attachment(s)
Re: pg_upgrade and logical replication

On Thu, 9 Nov 2023 at 12:23, Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Nov 09, 2023 at 01:14:05PM +1100, Peter Smith wrote:

Looks like v12 accidentally forgot to update this to the modified
function name 'binary_upgrade_add_sub_rel_state'

This v12 is overall cleaner than its predecessors. Nice to see.

+my $result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on publisher");
+$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(1), "check initial t1 table data on publisher");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on the old subscriber");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t2");

I'd argue that t1 and t2 should have less generic names. t1 is used
to check that the upgrade process works, while t2 is added to the
publication after upgrading the subscriber. Say something like
tab_upgraded or tab_not_upgraded?

Modified

+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r');";

Perhaps it would be safer to use a query that checks the number of
relations in 'r' state? This query would return true if
pg_subscription_rel has no tuples.

Modified

+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";

Relying on a pkey error to enforce an incorrect state is a good trick.
Nice.

That was a better way to get the datasync state without manually
changing the pg_subscription_rel catalog.

+command_fails(
+    [
+        'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+        '-D',         $new_sub->data_dir, '-b', $bindir,
+        '-B',         $bindir,            '-s', $new_sub->host,
+        '-p',         $old_sub->port,     '-P', $new_sub->port,
+        $mode,        '--check',
+    ],
+    'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state');
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");

Okay by me to not stop the cluster for the --check to shave a few
cycles. It's a bit sad that we don't cross-check the contents of
subscription_state.txt before removing pg_upgrade_output.d. Finding
the file is easy even if the subdir where it is included is not a
constant name. Then it is possible to apply a regexp with the
contents consumed by a slurp_file().

Modified

+my $remote_lsn = $old_sub->safe_psql('postgres',
+    "SELECT remote_lsn FROM pg_replication_origin_status");
Perhaps you've not noticed, but this would be 0/0 most of the time.
However the intention is to check after a valid LSN to make sure that
the origin is set, no?

I have added a few more inserts so that remote_lsn is not 0/0.

I am wondering whether this should use a bit more data than just one
tuple, say at least two transaction, one of them with a multi-value
INSERT?

Added one more multi-insert

+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready state.
+# ------------------------------------------------------
This comment is a bit inconsistent with the state that are accepted,
but why not, at least that's predictible.

The key test validation is described in this style of comment.

+ * relation to pg_subscripion_rel table. This will be used only in

Typo: s/pg_subscripion_rel/pg_subscription_rel/.

Modified

This needs some word-smithing to explain the reasons why a state is
not needed:

+        /*
+         * The subscription relation should be in either i (initialize),
+         * r (ready) or s (synchronized) state as either the replication slot
+         * is not created or the replication slot is already dropped and the
+         * required WAL files will be present in the publisher. The other
+         * states are not ok as the worker has dependency on the replication
+         * slot/origin in these case:

A slot not created yet refers to the 'i' state, while 'r' and 's'
refer to a slot created previously but already dropped, right?
Shouldn't this comment tell that rather than mixing the assumptions?

Modified

+         * a) SUBREL_STATE_DATASYNC: In this case, the table sync worker will
+         * try to drop the replication slot but as the replication slots will
+         * be created with old subscription id in the publisher and the
+         * upgraded subscriber will not be able to clean the slots in this
+         * case.

Proposal: A relation upgraded while in this state would retain a
replication slot, which could not be dropped by the sync worker
spawned after the upgrade because the subscription ID tracked by the
publisher does not match anymore.

Modified

Note: actually, this would be OK if we are able to keep the OIDs of
the subscribers consistent across upgrades? I'm OK with not doing anything
about that in this patch, to keep it simpler. Just asking in passing.

I will analyze more on this and post the analysis in the subsequent mail.

+         * b) SUBREL_STATE_FINISHEDCOPY: In this case, the tablesync worker will
+         * expect the origin to be already existing as the origin is created
+         * with an old subscription id, tablesync worker will not be able to
+         * find the origin in this case.

Proposal: A tablesync worker spawned to work on a relation upgraded
while in this state would expect an origin ID with the OID of the
subscription used before the upgrade, causing it to fail.

Modified

+ "A list of problem subscriptions is in the file:\n"

Sounds a bit strange, perhaps use an extra "the", as of "the problem
subscriptions"?

Modified

Could it be worth mentioning in the docs that one could also DISABLE
the subscriptions before running the upgrade?

Since the changes that we are planning to make won't start the apply
workers during the upgrade, there will be no impact even if the
subscriptions are enabled. So I felt no need to mention it unless we
plan to allow starting apply workers during the upgrade.

+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.

Hmm. No need to mention pg_replication_origin_status?

When we create an origin, the origin status entry is created
implicitly, so I felt we need not check the replication origin status,
nor mention it here.

If I may ask, how did you check that the given relation states were
OK or not OK? Did you hardcode some wait points in tablesync.c up to
where a state is updated in pg_subscription_rel, then shutdown the
cluster before the upgrade to maintain the catalog in this state?
Finally, after the upgrade, you've cross-checked the dependencies on
the slots and origins to see that the spawned sync workers turned
crazy because of the inconsistencies. Right?

I did testing along the same lines that you mentioned. Apart from that
I also reviewed the design around the places that use the old
subscription id: for example, the table sync workers use the old
subscription id for the replication slot and the replication origin. I
also checked the impact on remote_lsn values.
A few examples: in SUBREL_STATE_DATASYNC state we will try to drop the
replication slot once the worker is started, but since the slot was
created with the old subscription id, we will not be able to drop the
replication slot, creating a leak. Similarly the problem exists with
SUBREL_STATE_FINISHEDCOPY, where we will not be able to drop the origin
created with the old sub id.
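One way to spot such leaked slots on the publisher, assuming the usual
tablesync slot naming scheme (pg_<suboid>_sync_<reloid>_<system id>),
would be:

```sql
-- Tablesync slots embed the subscription OID in their name; after an upgrade
-- that changes the subscription OID, any slot matching this pattern that no
-- longer belongs to a live sync worker is a leftover.
SELECT slot_name, active, restart_lsn
FROM pg_replication_slots
WHERE slot_name LIKE 'pg\_%\_sync\_%';
```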

Thanks for the comments, the attached v13 version patch has the
changes for the same.

Regards,
Vignesh

Attachments:

v13-0001-Preserve-the-full-subscription-s-state-during-pg.patchtext/x-patch; charset=US-ASCII; name=v13-0001-Preserve-the-full-subscription-s-state-during-pg.patchDownload
From 81f8423df8672bd29ad214946fb29a282cd6c796 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v13] Preserve the full subscription's state during pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_add_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_add_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, the state is the state of the relation, sublsn is subscription lsn
which is optional, and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values (subname, relid, state and sublsn) from the
old cluster.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To address this, this patch also teaches pg_dump to update the
replication origin along with creating the subscription, by using the
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is subscription lsn.
pg_dump will retrieve these values (subname and sublsn) from the old cluster.

pg_upgrade will check that all the subscription relations are in 'i' (init),
's' (synchronized) or 'r' (ready) state, and will error out if that's not the
case, logging the reason for the failure.

Author: Julien Rouhaud, Vignesh C
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  64 +++++
 src/backend/utils/adt/pg_upgrade_support.c | 118 ++++++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 201 +++++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  16 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 123 ++++++++++
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 261 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/tools/pgindent/typedefs.list           |   1 +
 11 files changed, 822 insertions(+), 6 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 46e8a0b746..1a6ad16060 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,70 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies, which include the subscription table information present in
+     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and also the subscription replication origin. This allows
+     logical replication on the new subscriber to continue from the point
+     where the old subscriber left off. Migration of subscription dependencies is only
+     supported when the old cluster is version 17.0 or later. Subscription
+     dependencies on clusters before version 17.0 will silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met, an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize), <literal>r</literal> (ready) or
+       <literal>s</literal> (synchronized). This can be verified by checking
+       <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+    </itemizedlist>
+
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, perform the following steps:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       On the subscriber, create any tables that were added to the
+       publication during the upgrade, and refresh the publication by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..4a3da80e49 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,21 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +311,115 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	if (!OidIsValid(relid))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("invalid relation identifier used: %u", relid));
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	sublsn;
+	char		originname[NAMEDATALEN];
+	RepOriginId originid;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	sublsn = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+	originid = replorigin_by_name(originname, false);
+	replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e863913849..c8a635330a 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4581,6 +4582,99 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			i;
+	int			cur_rel = 0;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBufferStr(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+						 " FROM pg_catalog.pg_subscription_rel"
+						 " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[cur_rel].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[cur_rel].dobj.catId.tableoid = relid;
+		subrinfo[cur_rel].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[cur_rel].dobj);
+		subrinfo[cur_rel].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[cur_rel].tblinfo = tblinfo;
+		subrinfo[cur_rel].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[cur_rel].srsublsn = NULL;
+		else
+			subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[cur_rel].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[cur_rel].dobj), fout);
+
+		cur_rel++;
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4607,6 +4701,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4662,17 +4757,20 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	appendPQExpBufferStr(query, "o.remote_lsn AS suboriginremotelsn\n");
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
-						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
+						 "FROM pg_catalog.pg_subscription s\n"
+						 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
+						 "WHERE s.subdbid = (SELECT oid FROM pg_catalog.pg_database\n"
 						 "                   WHERE datname = current_database())");
 
 	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
@@ -4698,6 +4796,7 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4735,6 +4834,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4744,6 +4848,80 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to pg_subscription_rel table. This will be used only in
+		 * binary-upgrade mode.
+		 */
+		if (fout->remoteVersion >= 170000)
+		{
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+			appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+			appendPQExpBuffer(query,
+							  ", %u, '%c'",
+							  subrinfo->tblinfo->dobj.catId.oid,
+							  subrinfo->srsubstate);
+
+			if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", '%s'",
+								  subrinfo->srsublsn);
+			else
+				appendPQExpBufferStr(query, ", NULL");
+
+			appendPQExpBufferStr(query, ");\n");
+		}
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4824,6 +5002,17 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000 &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10442,6 +10631,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18508,6 +18700,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..62b3d9249b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -671,8 +672,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +711,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +771,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..96e35602ac 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -20,6 +20,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_state(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -111,6 +112,7 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_composite_data_type_usage(&old_cluster);
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
+	check_for_subscription_state(&old_cluster);
 
 	/*
 	 * Logical replication slots can be migrated since PG17. See comments atop
@@ -812,6 +814,127 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_state()
+ *
+ * Verify that each subscription has all of its relations in 'i' (initialize),
+ * 'r' (ready) or 's' (synchronized) state.
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	/* Subscription relations state can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subscription_state.txt");
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "replication origin is missing for database:%s subscription:%s\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * A slot not created yet refers to the 'i' (initialize) state, while
+		 * 'r' (ready) and 's' (synchronized) states refer to a slot created
+		 * previously but already dropped. These states are supported for
+		 * upgrade. The other states listed below are not ok:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * tracked by the publisher does not match anymore.
+		 *
+		 * b) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * c) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, n.nspname, c.relname, r.srsubstate "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r', 's') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "database:%s subscription:%s schema:%s relation:%s state:%s not in required state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in i (initialize), r (ready) or s (synchronized) state.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..ebc6b75257
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,261 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+use File::Path qw(rmtree);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $bindir = $new_sub->config_data('--bindir');
+
+sub insert_line_at_pub
+{
+	my $payload = shift;
+
+	foreach ("tab_upgraded", "tab_not_upgraded")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("tab_upgraded", "tab_not_upgraded")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line_at_pub('before initial sync');
+
+# Setup logical replication, replicating only 1 table
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub FOR TABLE tab_upgraded");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION regress_pub"
+);
+
+# Wait for initial table sync to finish, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded VALUES (generate_series(2,50), 'after initial sync')"
+);
+$publisher->wait_for_catchup('regress_sub');
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready state.
+# ------------------------------------------------------
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance when the subscription tables are in ready state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Check the number of rows for each table on each server
+my $result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(50), "check initial tab_upgraded table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(1), "check initial tab_not_upgraded table data on publisher");
+$result =
+  $old_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(50),
+	"check initial tab_upgraded table data on the old subscriber");
+$result =
+  $old_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(0),
+	"check initial tab_not_upgraded table data on the old subscriber");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if there's a subscription with tables in
+# a state different than 'r' (ready), 'i' (init) and 's' (synchronized).
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$old_sub->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+# Insert the same value that is already present on the publisher into the
+# primary key column on the subscriber so that table sync will fail.
+$old_sub->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_primary_key");
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync (invalid) state'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subscription_state\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/database:postgres subscription:regress_sub schema:public relation:tab_primary_key state:d not in required state/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Delete the table data so that the primary key violation error will not happen
+# and tab_primary_key reaches ready state.
+$old_sub->safe_psql('postgres', "DELETE FROM tab_primary_key");
+
+$synced_query =
+  "SELECT count(1) = 2 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# ------------------------------------------------------
+# The incremental changes added to the publisher are replicated after the upgrade.
+# ------------------------------------------------------
+
+# Stop the old subscriber, insert a row into publisher tables tab_upgraded and
+# tab_not_upgraded while it's down, and add tab_not_upgraded to the publication.
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+insert_line_at_pub('while old_sub is down');
+
+# Run pg_upgrade
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+	],
+	'run of pg_upgrade for new sub');
+
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_not_upgraded");
+
+$new_sub->start;
+
+# Subscription relations and replication origin remote_lsn should be preserved
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2),
+	"There should be 2 rows in pg_subscription_rel (representing tab_upgraded and tab_primary_key)"
+);
+
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s where os.external_id = 'pg_' || s.oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# There should be no new replicated rows before enabling the subscription
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(50),
+	"tab_upgraded table has no new replicated rows before enabling the subscription"
+);
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(0),
+	"no change in tab_not_upgraded table which is not part of the publication"
+);
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub ENABLE");
+
+$publisher->wait_for_catchup('regress_sub');
+
+# Rows on tab_upgraded should have been replicated, while nothing should happen
+# for tab_not_upgraded.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(0),
+	"no change in tab_not_upgraded table, which is not part of the publication, after enabling the subscription"
+);
+
+# Refresh the subscription, only the missing row on tab_not_upgraded should be
+# replicated.
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(51),
+	"check there is no change when there were no changes replicated");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index f14aed422a..c7bf3cbd55 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11382,6 +11382,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index bf50a32119..a4946b40b1 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2660,6 +2660,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#115vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#111)
Re: pg_upgrade and logical replication

On Thu, 9 Nov 2023 at 07:44, Peter Smith <smithpb2250@gmail.com> wrote:

Thanks for addressing my previous review comments.

I re-checked the latest patch v12-0001 and found the following:

======
Commit message

1.
The new SQL binary_upgrade_create_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_create_sub_rel_state(subname text, relid oid,
state char [,sublsn pg_lsn])

~

Looks like v12 accidentally forgot to update this to the modified
function name 'binary_upgrade_add_sub_rel_state'

This is handled in the v13 version patch posted at:
/messages/by-id/CALDaNm0mGz6_69BiJTmEqC8Q0U0x2nMZOs3w9btKOHZZpfC2ow@mail.gmail.com

Regards,
Vignesh

#116Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#114)
Re: pg_upgrade and logical replication

Here are some review comments for patch v13-0001

======
src/bin/pg_dump/pg_dump.c

1. getSubscriptionTables

+ int i_srsublsn;
+ int i;
+ int cur_rel = 0;
+ int ntups;

What is the difference between 'i' and 'cur_rel'?

AFAICT these represent the same tuple index, in which case you might
as well throw away 'cur_rel' and only keep 'i'.

~~~

2. getSubscriptionTables

+ for (i = 0; i < ntups; i++)
+ {
+ Oid cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+ Oid relid = atooid(PQgetvalue(res, i, i_srrelid));
+ TableInfo  *tblinfo;

Since this is all new code, using C99 style for loop variable
declaration of 'i' will be better.

======
src/bin/pg_upgrade/check.c

3. check_for_subscription_state

+check_for_subscription_state(ClusterInfo *cluster)
+{
+ int dbnum;
+ FILE    *script = NULL;
+ char output_path[MAXPGPATH];
+ int ntup;
+
+ /* Subscription relations state can be migrated since PG17. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+ return;
+
+ prep_status("Checking for subscription state");
+
+ snprintf(output_path, sizeof(output_path), "%s/%s",
+ log_opts.basedir,
+ "subscription_state.txt");

I felt this filename ought to be more like
'subscriptions_with_bad_state.txt' because the current name looks like
a normal logfile with nothing to indicate that it is only for the
states of the "bad" subscriptions.

~~~

4.
+ for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+ {

Since this is all new code, using C99 style for loop variable
declaration of 'dbnum' will be better.

~~~

5.
+ * a) SUBREL_STATE_DATASYNC:A relation upgraded while in this state
+ * would retain a replication slot, which could not be dropped by the
+ * sync worker spawned after the upgrade because the subscription ID
+ * tracked by the publisher does not match anymore.

missing whitespace

/SUBREL_STATE_DATASYNC:A relation/SUBREL_STATE_DATASYNC: A relation/

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#117Michael Paquier
michael@paquier.xyz
In reply to: vignesh C (#114)
Re: pg_upgrade and logical replication

On Fri, Nov 10, 2023 at 07:26:18PM +0530, vignesh C wrote:

I did testing in the same lines that you mentioned. Apart from that I
also reviewed the design where it was using the old subscription id
like in case of table sync workers, the tables sync worker will use
replication using old subscription id. replication slot and
replication origin. I also checked the impact of remote_lsn's.
Few example: IN SUBREL_STATE_DATASYNC state we will try to drop the
replication slot once worker is started but since the slot will be
created with an old subscription, we will not be able to drop the
replication slot and create a leak. Similarly the problem exists with
SUBREL_STATE_FINISHEDCOPY where we will not be able to drop the origin
created with an old sub id.

Yeah, I was playing a bit with these states and I can confirm that
leaving around a DATASYNC relation in pg_subscription_rel during
the upgrade would leave a slot on the publisher of the old cluster,
which is no good. It would be an option to explore later what could
be improved, but I'm also looking forward at hearing from the users
first, as what you have here may be enough for the basic purposes we
are trying to cover. FINISHEDCOPY similarly, is not OK. I was able
to get an origin lying around after an upgrade.

Anyway, after a closer lookup, I think that your conclusions regarding
the states that are allowed in the patch during the upgrade have some
flaws.

First, are you sure that SYNCDONE is OK to keep? This catalog state
is set in process_syncing_tables_for_sync(), and just after the code
opens a transaction to clean up the tablesync slot, followed by a
second transaction to clean up the origin. However, imagine that
there is a failure in dropping the slot, the origin, or just in
transaction processing, cannot we finish in a state where the relation
is marked as SYNCDONE in the catalog but still has an origin and/or a
tablesync slot lying around? Assuming that SYNCDONE is an OK state
seems incorrect to me. I am pretty sure that injecting an error in a
code path after the slot is created would equally lead to an
inconsistency.

It seems to me that INIT cannot be relied on for a similar reason.
This state would be set for a new relation in
LogicalRepSyncTableStart(), and the relation would still be in INIT
state when creating the slot via walrcv_create_slot() in a second
transaction started a bit later. However, if we have a failure after
the transaction that created the slot commits, then we'd have an INIT
relation in the catalog that got committed *and* a slot related to it
lying around.

The only state that I can see is possible to rely on safely is READY,
set in the same transaction as when the replication origin is dropped,
because that's the point where we are sure that there are no origin
and no tablesync slot: the READY state is visible in the catalog only
if the transaction dropping the slot succeeds. Even with this one, I
was having the odd feeling that there's a code path where we could
leak something, though I have not seen a problem with it after a few
hours of looking at this area.
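
The state check being discussed here can be approximated manually on the
old cluster with something like the following (a hypothetical pre-flight
query, not part of the patch):

```sql
-- On the old cluster: list subscription relations that are not yet in
-- the 'r' (ready) state, i.e. the ones at risk of leaving a tablesync
-- slot or replication origin behind after an upgrade.
SELECT s.subname,
       sr.srrelid::regclass AS relation,
       sr.srsubstate
FROM pg_subscription_rel sr
JOIN pg_subscription s ON s.oid = sr.srsubid
WHERE sr.srsubstate <> 'r';
```

An empty result would mean every relation has reached READY, the one
state argued above to be safe.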
--
Michael

#118Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#117)
Re: pg_upgrade and logical replication

On Mon, Nov 13, 2023 at 1:52 PM Michael Paquier <michael@paquier.xyz> wrote:

It seems to me that INIT cannot be relied on for a similar reason.
This state would be set for a new relation in
LogicalRepSyncTableStart(), and the relation would still be in INIT
state when creating the slot via walrcv_create_slot() in a second
transaction started a bit later.

Before creating a slot, we changed the state to DATASYNC.

However, if we have a failure after
the transaction that created the slot commits, then we'd have an INIT
relation in the catalog that got committed *and* a slot related to it
lying around.

I don't think this can happen; otherwise this would be a problem even
without an upgrade, after a restart.

--
With Regards,
Amit Kapila.

#119Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#114)
Re: pg_upgrade and logical replication

On Fri, Nov 10, 2023 at 7:26 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the comments, the attached v13 version patch has the
changes for the same.

+
+ ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname,
sizeof(originname));
+ originid = replorigin_by_name(originname, false);
+ replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+    false /* backward */ ,
+    false /* WAL log */ );

This seems to update the origin state only in memory. Is it sufficient
to use this here? Anyway, I think using this requires us to first
acquire RowExclusiveLock on pg_replication_origin, something the patch
is already doing for some other system tables.

--
With Regards,
Amit Kapila.

#120Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#119)
Re: pg_upgrade and logical replication

On Mon, Nov 13, 2023 at 5:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Nov 10, 2023 at 7:26 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the comments, the attached v13 version patch has the
changes for the same.

+
+ ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname,
sizeof(originname));
+ originid = replorigin_by_name(originname, false);
+ replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+    false /* backward */ ,
+    false /* WAL log */ );

This seems to update the origin state only in memory. Is it sufficient
to use this here?

I think it is probably getting ensured by clean shutdown
(shutdown_checkpoint) which happens on the new cluster after calling
this function. We can probably try to add a comment for it. BTW, we
also need to ensure that max_replication_slots is configured to a
value higher than origins we are planning to create on the new
cluster.
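
The max_replication_slots requirement mentioned above could be
sanity-checked along these lines (a sketch only; the actual check would
belong in pg_upgrade itself):

```sql
-- On the old cluster: count the subscriptions whose replication origins
-- must be re-created on the new cluster.
SELECT count(*) AS subscriptions_needing_origins FROM pg_subscription;

-- On the new cluster: this setting must be >= the count above.
SHOW max_replication_slots;
```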

--
With Regards,
Amit Kapila.

#121Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#118)
Re: pg_upgrade and logical replication

On Mon, Nov 13, 2023 at 04:02:27PM +0530, Amit Kapila wrote:

On Mon, Nov 13, 2023 at 1:52 PM Michael Paquier <michael@paquier.xyz> wrote:

It seems to me that INIT cannot be relied on for a similar reason.
This state would be set for a new relation in
LogicalRepSyncTableStart(), and the relation would still be in INIT
state when creating the slot via walrcv_create_slot() in a second
transaction started a bit later.

Before creating a slot, we changed the state to DATASYNC.

Still, playing the devil's advocate, couldn't it be possible that a
server crashes just after the slot got created, then restarts with
max_logical_replication_workers=0? This would keep the catalog in a
state authorized by the upgrade, still leak a replication slot on the
publication side if the node gets upgraded. READY in the catalog
seems to be the only state where we are guaranteed that there is no
origin and no slot remaining around.
--
Michael

#122vignesh C
vignesh21@gmail.com
In reply to: Michael Paquier (#117)
Re: pg_upgrade and logical replication

On Mon, 13 Nov 2023 at 13:52, Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Nov 10, 2023 at 07:26:18PM +0530, vignesh C wrote:

I did testing in the same lines that you mentioned. Apart from that I
also reviewed the design where it was using the old subscription id
like in case of table sync workers, the tables sync worker will use
replication using old subscription id. replication slot and
replication origin. I also checked the impact of remote_lsn's.
Few example: IN SUBREL_STATE_DATASYNC state we will try to drop the
replication slot once worker is started but since the slot will be
created with an old subscription, we will not be able to drop the
replication slot and create a leak. Similarly the problem exists with
SUBREL_STATE_FINISHEDCOPY where we will not be able to drop the origin
created with an old sub id.

Yeah, I was playing a bit with these states and I can confirm that
leaving around a DATASYNC relation in pg_subscription_rel during
the upgrade would leave a slot on the publisher of the old cluster,
which is no good. It would be an option to explore later what could
be improved, but I'm also looking forward at hearing from the users
first, as what you have here may be enough for the basic purposes we
are trying to cover. FINISHEDCOPY similarly, is not OK. I was able
to get an origin lying around after an upgrade.

Anyway, after a closer lookup, I think that your conclusions regarding
the states that are allowed in the patch during the upgrade have some
flaws.

First, are you sure that SYNCDONE is OK to keep? This catalog state
is set in process_syncing_tables_for_sync(), and just after the code
opens a transaction to clean up the tablesync slot, followed by a
second transaction to clean up the origin. However, imagine that
there is a failure in dropping the slot, the origin, or just in
transaction processing, cannot we finish in a state where the relation
is marked as SYNCDONE in the catalog but still has an origin and/or a
tablesync slot lying around? Assuming that SYNCDONE is an OK state
seems incorrect to me. I am pretty sure that injecting an error in a
code path after the slot is created would equally lead to an
inconsistency.

There are a couple of things happening here: a) In the first part we
take care of setting the subscription relation to SYNCDONE and dropping
the replication slot at the publisher node. Only if dropping the
replication slot succeeds is the relation state set to SYNCDONE; if the
drop fails, the relation state will still be FINISHEDCOPY. So if there
is a failure in dropping the replication slot we will not have an
issue, as the tablesync worker will be in FINISHEDCOPY state and this
state is not allowed for upgrade. When the state is SYNCDONE the
tablesync slot will not be present. b) In the second part we drop the
replication origin. Even if the drop of the replication origin fails
for some reason, there will be no problem, as we do not copy the
tablesync replication origin to the new cluster while upgrading. Since
the tablesync replication origin is not copied to the new cluster there
will be no replication origin leaks.
I feel these issues will not be there in the SYNCDONE state.
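
Whether a tablesync slot was in fact leaked on the publisher can be
checked after the fact with something like the following (a sketch;
tablesync slot names follow the pg_<suboid>_sync_<reloid>_<sysid>
naming convention):

```sql
-- On the publisher: look for leftover tablesync slots.
SELECT slot_name, active
FROM pg_replication_slots
WHERE slot_name LIKE 'pg\_%\_sync\_%' ESCAPE '\';
```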

Regards,
Vignesh

#123Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#121)
Re: pg_upgrade and logical replication

On Tue, Nov 14, 2023 at 5:52 AM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Nov 13, 2023 at 04:02:27PM +0530, Amit Kapila wrote:

On Mon, Nov 13, 2023 at 1:52 PM Michael Paquier <michael@paquier.xyz> wrote:

It seems to me that INIT cannot be relied on for a similar reason.
This state would be set for a new relation in
LogicalRepSyncTableStart(), and the relation would still be in INIT
state when creating the slot via walrcv_create_slot() in a second
transaction started a bit later.

Before creating a slot, we changed the state to DATASYNC.

Still, playing the devil's advocate, couldn't it be possible that a
server crashes just after the slot got created, then restarts with
max_logical_replication_workers=0? This would keep the catalog in a
state authorized by the upgrade,

The state should be DATASYNC by that time, and I don't think that is a
state allowed by the upgrade.

--
With Regards,
Amit Kapila.

#124vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#116)
1 attachment(s)
Re: pg_upgrade and logical replication

On Mon, 13 Nov 2023 at 13:52, Peter Smith <smithpb2250@gmail.com> wrote:

Here are some review comments for patch v13-0001

======
src/bin/pg_dump/pg_dump.c

1. getSubscriptionTables

+ int i_srsublsn;
+ int i;
+ int cur_rel = 0;
+ int ntups;

What is the difference between 'i' and 'cur_rel'?

AFAICT these represent the same tuple index, in which case you might
as well throw away 'cur_rel' and only keep 'i'.

Modified

~~~

2. getSubscriptionTables

+ for (i = 0; i < ntups; i++)
+ {
+ Oid cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+ Oid relid = atooid(PQgetvalue(res, i, i_srrelid));
+ TableInfo  *tblinfo;

Since this is all new code, using C99 style for loop variable
declaration of 'i' will be better.

Modified

======
src/bin/pg_upgrade/check.c

3. check_for_subscription_state

+check_for_subscription_state(ClusterInfo *cluster)
+{
+ int dbnum;
+ FILE    *script = NULL;
+ char output_path[MAXPGPATH];
+ int ntup;
+
+ /* Subscription relations state can be migrated since PG17. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+ return;
+
+ prep_status("Checking for subscription state");
+
+ snprintf(output_path, sizeof(output_path), "%s/%s",
+ log_opts.basedir,
+ "subscription_state.txt");

I felt this filename ought to be more like
'subscriptions_with_bad_state.txt' because the current name looks like
a normal logfile with nothing to indicate that it is only for the
states of the "bad" subscriptions.

I have kept the file name intentionally short, as we noticed some
buildfarm failures caused by longer names when the upgrade-of-publisher
patch used a longer name.

~~~

4.
+ for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+ {

Since this is all new code, using C99 style for loop variable
declaration of 'dbnum' will be better.

Modified

~~~

5.
+ * a) SUBREL_STATE_DATASYNC:A relation upgraded while in this state
+ * would retain a replication slot, which could not be dropped by the
+ * sync worker spawned after the upgrade because the subscription ID
+ * tracked by the publisher does not match anymore.

missing whitespace

/SUBREL_STATE_DATASYNC:A relation/SUBREL_STATE_DATASYNC: A relation/

Modified

Also added a couple of missing test cases. The attached v14 version
patch has the changes for the same.

Regards,
Vignesh

Attachments:

v14-0001-Preserve-the-full-subscription-s-state-during-pg.patchtext/x-patch; charset=US-ASCII; name=v14-0001-Preserve-the-full-subscription-s-state-during-pg.patchDownload
From 354137c80dfacc30bd0fa85c2f993f34ae5af4b9 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v14] Preserve the full subscription's state during pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_add_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_add_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, the state is the state of the relation, sublsn is subscription lsn
which is optional, and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values (subname, relid, state and sublsn) from the
old cluster.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To fix this problem, this patch teaches pg_dump to update the replication
origin along with create subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is subscription lsn.
pg_dump will retrieve these values (subname and sublsn) from the old cluster.

pg_upgrade will check that all the subscription relations are in 'i' (init),
's' (synchronized) or 'r' (ready) state, and will error out if that's not the
case, logging the reason for the failure.

Author: Julien Rouhaud, Vignesh C
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  72 ++++
 src/backend/utils/adt/pg_upgrade_support.c | 130 +++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 197 ++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  16 +
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 169 ++++++++-
 src/bin/pg_upgrade/info.c                  |  25 ++
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   1 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 392 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/tools/pgindent/typedefs.list           |   1 +
 13 files changed, 1024 insertions(+), 23 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 46e8a0b746..b824097e87 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,78 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies which includes the subscription table information present in
+     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and also the subscription replication origin. This allows
+     logical replication on the new subscriber to continue from where the
+     old subscriber left off. Migration of subscription dependencies is only
+     supported when the old cluster is version 17.0 or later. Subscription
+     dependencies on clusters before version 17.0 will silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize), <literal>r</literal> (ready) or
+       <literal>s</literal> (synchronized). This can be verified by checking
+       <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, do this:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Create all the new tables that were created in the publication during
+       the upgrade, and refresh the publication by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..75d77d8e22 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,22 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +312,126 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	if (!OidIsValid(relid))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("invalid relation identifier used: %u", relid));
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	sublsn;
+	char		originname[NAMEDATALEN];
+	RepOriginId originid;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	sublsn = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	originid = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster, and the shutdown checkpoint will then flush the advanced
+	 * origins to disk.
+	 */
+	replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
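To make the intended call sequence concrete, here is roughly what the binary-upgrade dump script would run against the new cluster using the two support functions added above (the subscription name, relation OID, state and LSN below are placeholders for illustration, not values taken from the patch):

```sql
-- Illustrative only: placeholder subscription name, relation OID, state
-- and LSN.  Both functions refuse to run outside binary-upgrade mode.
SELECT pg_catalog.binary_upgrade_add_sub_rel_state('regress_sub', 16384, 'r', '0/12345678');
SELECT pg_catalog.binary_upgrade_replorigin_advance('regress_sub', '0/12345678');
```

The first call repopulates pg_subscription_rel for one relation; the second restores the replication origin's remote_lsn so the apply worker resumes from the right position after the upgrade.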
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e863913849..eb48f12fbe 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4581,6 +4582,95 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables.
+ *	  This is used only in binary-upgrade mode.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBufferStr(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+						 " FROM pg_catalog.pg_subscription_rel"
+						 " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4607,6 +4697,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4662,17 +4753,20 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	appendPQExpBufferStr(query, "o.remote_lsn AS suboriginremotelsn\n");
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
-						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
+						 "FROM pg_catalog.pg_subscription s\n"
+						 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+						 "    ON o.external_id = 'pg_' || s.oid::text \n"
+						 "WHERE s.subdbid = (SELECT oid FROM pg_catalog.pg_database\n"
 						 "                   WHERE datname = current_database())");
 
 	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
@@ -4698,6 +4792,7 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4735,6 +4830,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4744,6 +4844,80 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping.  This is
+ *	  used only in binary-upgrade mode.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to pg_subscription_rel table. This will be used only in
+		 * binary-upgrade mode.
+		 */
+		if (fout->remoteVersion >= 170000)
+		{
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+			appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+			appendPQExpBuffer(query,
+							  ", %u, '%c'",
+							  subrinfo->tblinfo->dobj.catId.oid,
+							  subrinfo->srsubstate);
+
+			if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+				appendPQExpBuffer(query, ", '%s'",
+								  subrinfo->srsublsn);
+			else
+				appendPQExpBufferStr(query, ", NULL");
+
+			appendPQExpBufferStr(query, ");\n");
+		}
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4824,6 +4998,17 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000 &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10442,6 +10627,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18508,6 +18696,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..62b3d9249b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -671,8 +672,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +711,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +771,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..9e89c3b2eb 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -20,6 +20,7 @@ static void check_is_install_user(ClusterInfo *cluster);
 static void check_proper_datallowconn(ClusterInfo *cluster);
 static void check_for_prepared_transactions(ClusterInfo *cluster);
 static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_subscription_state(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
@@ -111,6 +112,7 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_composite_data_type_usage(&old_cluster);
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
+	check_for_subscription_state(&old_cluster);
 
 	/*
 	 * Logical replication slots can be migrated since PG17. See comments atop
@@ -812,6 +814,126 @@ check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_subscription_state()
+ *
+ * Verify that all relations of each subscription are in the i (initialize),
+ * r (ready) or s (synchronized) state.
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	/* Subscription relation states can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subscription_state.txt");
+	for (int dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "replication origin is missing for database:%s subscription:%s\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * In the 'i' (initialize) state no replication slot has been created
+		 * yet, while in the 'r' (ready) and 's' (synchronized) states the
+		 * slot created earlier has already been dropped.  Only these three
+		 * states are safe to upgrade.  The remaining states listed below are
+		 * not:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * tracked by the publisher would no longer match.
+		 *
+		 * b) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * derived from the pre-upgrade subscription OID, causing it to fail.
+		 *
+		 * c) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are never stored in the
+		 * catalog, so they need not be handled here.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, n.nspname, c.relname, r.srsubstate "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r', 's') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "database:%s subscription:%s schema:%s relation:%s state:%s not in required state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in the i (initialize), r (ready) or s (synchronized) state.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
+
 /*
  * Verify that no user defined postfix operators exist.
  */
@@ -1470,7 +1592,8 @@ check_for_user_defined_encoding_conversions(ClusterInfo *cluster)
  * check_new_cluster_logical_replication_slots()
  *
  * Verify that there are no logical replication slots on the new cluster and
- * that the parameter settings necessary for creating slots are sufficient.
+ * that the parameter settings necessary for creating slots and subscriptions
+ * are sufficient.
  */
 static void
 check_new_cluster_logical_replication_slots(void)
@@ -1479,6 +1602,7 @@ check_new_cluster_logical_replication_slots(void)
 	PGconn	   *conn;
 	int			nslots_on_old;
 	int			nslots_on_new;
+	int			nsubs_on_old = old_cluster.subscription_count;
 	int			max_replication_slots;
 	char	   *wal_level;
 
@@ -1488,29 +1612,35 @@ check_new_cluster_logical_replication_slots(void)
 
 	nslots_on_old = count_old_cluster_logical_slots();
 
-	/* Quick return if there are no logical slots to be migrated. */
-	if (nslots_on_old == 0)
+	/*
+	 * Quick return if there are no logical slots and subscriptions to be
+	 * migrated.
+	 */
+	if (nslots_on_old == 0 && nsubs_on_old == 0)
 		return;
 
 	conn = connectToServer(&new_cluster, "template1");
 
-	prep_status("Checking for new cluster logical replication slots");
+	if (nslots_on_old)
+	{
+		prep_status("Checking for new cluster logical replication slots");
 
-	res = executeQueryOrDie(conn, "SELECT count(*) "
-							"FROM pg_catalog.pg_replication_slots "
-							"WHERE slot_type = 'logical' AND "
-							"temporary IS FALSE;");
+		res = executeQueryOrDie(conn, "SELECT count(*) "
+								"FROM pg_catalog.pg_replication_slots "
+								"WHERE slot_type = 'logical' AND "
+								"temporary IS FALSE;");
 
-	if (PQntuples(res) != 1)
-		pg_fatal("could not count the number of logical replication slots");
+		if (PQntuples(res) != 1)
+			pg_fatal("could not count the number of logical replication slots");
 
-	nslots_on_new = atoi(PQgetvalue(res, 0, 0));
+		nslots_on_new = atoi(PQgetvalue(res, 0, 0));
 
-	if (nslots_on_new)
-		pg_fatal("Expected 0 logical replication slots but found %d.",
-				 nslots_on_new);
+		if (nslots_on_new)
+			pg_fatal("Expected 0 logical replication slots but found %d.",
+					nslots_on_new);
 
-	PQclear(res);
+		PQclear(res);
+	}
 
 	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
 							"WHERE name IN ('wal_level', 'max_replication_slots') "
@@ -1521,17 +1651,22 @@ check_new_cluster_logical_replication_slots(void)
 
 	wal_level = PQgetvalue(res, 0, 0);
 
-	if (strcmp(wal_level, "logical") != 0)
+	if (nslots_on_old && strcmp(wal_level, "logical") != 0)
 		pg_fatal("wal_level must be \"logical\", but is set to \"%s\"",
 				 wal_level);
 
 	max_replication_slots = atoi(PQgetvalue(res, 1, 0));
 
-	if (nslots_on_old > max_replication_slots)
+	if (nslots_on_old && nslots_on_old > max_replication_slots)
 		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
 				 "logical replication slots (%d) on the old cluster",
 				 max_replication_slots, nslots_on_old);
 
+	if (nsubs_on_old && nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
 	PQclear(res);
 	PQfinish(conn);
 
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..f674ecd52e 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -21,6 +21,7 @@ static void report_unmatched_relation(const RelInfo *rel, const DbInfo *db,
 									  bool is_new_db);
 static void free_db_and_rel_infos(DbInfoArr *db_arr);
 static void get_template0_info(ClusterInfo *cluster);
+static void get_subscription_count(ClusterInfo *cluster);
 static void get_db_infos(ClusterInfo *cluster);
 static void get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo);
 static void free_rel_infos(RelInfoArr *rel_arr);
@@ -286,6 +287,9 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 	get_template0_info(cluster);
 	get_db_infos(cluster);
 
+	if (cluster == &old_cluster)
+		get_subscription_count(cluster);
+
 	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
 	{
 		DbInfo	   *pDbInfo = &cluster->dbarr.dbs[dbnum];
@@ -365,6 +369,27 @@ get_template0_info(ClusterInfo *cluster)
 	PQfinish(conn);
 }
 
+/*
+ * Get the number of subscriptions in the old cluster.
+ */
+static void
+get_subscription_count(ClusterInfo *cluster)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	if (GET_MAJOR_VERSION(cluster->major_version) < 1700)
+		return;
+
+	conn = connectToServer(cluster, "template1");
+	res = executeQueryOrDie(conn,
+							  "SELECT oid FROM pg_catalog.pg_subscription");
+
+	cluster->subscription_count = PQntuples(res);
+
+	PQclear(res);
+	PQfinish(conn);
+}
 
 /*
  * get_db_infos()
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
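One operational consequence of the relaxed checks above: if the old cluster has subscriptions but no logical slots, the new cluster no longer needs wal_level = logical, but max_replication_slots must still be at least the old cluster's subscription count. A minimal postgresql.conf sketch for such a new cluster (the count is a placeholder):

```
# New cluster migrating 4 subscriptions and no logical slots:
max_replication_slots = 4   # must be >= subscriptions on the old cluster
# wal_level does not need to be 'logical' in this case
```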
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..07cd6ed34c 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -292,6 +292,7 @@ typedef struct
 	char		major_version_str[64];	/* string PG_VERSION of cluster */
 	uint32		bin_version;	/* version returned from pg_ctl */
 	const char *tablespace_suffix;	/* directory specification */
+	int			subscription_count;	/* number of subscriptions */
 } ClusterInfo;
 
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..2823c17e82
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,392 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+use File::Path qw(rmtree);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $bindir = $new_sub->config_data('--bindir');
+
+sub insert_line_at_pub
+{
+	my $payload = shift;
+
+	foreach ("tab_upgraded", "tab_not_upgraded")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("tab_upgraded", "tab_not_upgraded")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line_at_pub('before initial sync');
+
+# Setup logical replication, replicating only 1 table
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub FOR TABLE tab_upgraded");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION regress_pub"
+);
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded VALUES (generate_series(2,50), 'after initial sync')"
+);
+$publisher->wait_for_catchup('regress_sub');
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready state.
+# ------------------------------------------------------
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance when the subscription tables are in ready state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Check the number of rows for each table on each server
+my $result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(50), "check initial tab_upgraded table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(1), "check initial tab_not_upgraded table data on publisher");
+$result =
+  $old_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(50),
+	"check initial tab_upgraded table data on the old subscriber");
+$result =
+  $old_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(0),
+	"check initial tab_not_upgraded table data on the old subscriber");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if there's a subscription with tables in
+# a state different than 'r' (ready), 'i' (init) and 's' (synchronized).
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$old_sub->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
+$old_sub->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_primary_key");
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subscription_state\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/database:postgres subscription:regress_sub schema:public relation:tab_primary_key state:d not in required state/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Delete the table data so that the primary key violation error will not happen
+# and tab_primary_key reaches ready state.
+$old_sub->safe_psql('postgres', "DELETE FROM tab_primary_key");
+
+$synced_query =
+  "SELECT count(1) = 2 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# ------------------------------------------------------
+# The incremental changes added to the publisher are replicated after upgrade.
+# ------------------------------------------------------
+
+# Stop the old subscriber, insert a row in tab_upgraded and tab_not_upgraded
+# publisher table while it's down and add tab_not_upgraded to the publication.
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+insert_line_at_pub('while old_sub is down');
+
+# Run pg_upgrade
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode,
+	],
+	'run of pg_upgrade for new sub');
+
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_not_upgraded");
+
+$new_sub->start;
+
+# Subscription relations and replication origin remote_lsn should be preserved
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2),
+	"There should be 2 rows in pg_subscription_rel(representing tab_upgraded and tab_primary_key)"
+);
+
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s where os.external_id = 'pg_' || s.oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# There should be no new replicated rows before enabling the subscription
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(50),
+	"tab_upgraded table has no new replicated rows before enabling the subscription"
+);
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(0),
+	"no change in tab_not_upgraded table which is not part of the publication"
+);
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub ENABLE");
+
+$publisher->wait_for_catchup('regress_sub');
+
+# Rows on tab_upgraded should have been replicated, while nothing should happen
+# for tab_not_upgraded.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(0),
+	"no change in table tab_not_upgraded afer enable subscription which is not part of the publication"
+);
+
+# Refresh the subscription, only the missing row on tab_not_upgraded should be
+# replicated.
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(51),
+	"check there is no change when there was no changes replicated");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when the table is in init state.
+# ------------------------------------------------------
+my $old_sub1 = PostgreSQL::Test::Cluster->new('old_sub1');
+$old_sub1->init;
+$old_sub1->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub1->start;
+
+$publisher->safe_psql('postgres',
+	"CREATE TABLE tab(id serial PRIMARY KEY, val text);");
+$old_sub1->safe_psql('postgres',
+	"CREATE TABLE tab(id serial PRIMARY KEY, val text);");
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub1 FOR TABLE tab");
+
+$old_sub1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab values(1, 'before initial sync')");
+
+# The tables will be in init state as the subscriber configuration for
+# max_logical_replication_workers is set to 0.
+$synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'";
+$old_sub1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# Initialize the new subscriber
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+
+$old_sub1->stop;
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub1->data_dir,
+		'-D',         $new_sub1->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub1->host,
+		'-p',         $old_sub1->port,     '-P', $new_sub1->port,
+		$mode,
+	],
+	'run of pg_upgrade --check for old instance when the subscription tables are in ready state'
+);
+ok( !-d $new_sub1->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+$new_sub1->start;
+
+$result =
+  $new_sub1->safe_psql('postgres', "SELECT srsubstate FROM pg_subscription_rel");
+is($result, qq(i), "check tab table is in init state after upgrade");
+
+# Check the number of rows in the table
+$result =
+  $new_sub1->safe_psql('postgres', "SELECT count(*) FROM tab");
+is($result, qq(0), "check initial tab table data on upgraded subscriber");
+
+# Enable the subscription
+$new_sub1->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub1 ENABLE");
+
+$new_sub1->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+# Check the number of rows in the table
+$result =
+  $new_sub1->safe_psql('postgres', "SELECT count(*) FROM tab");
+is($result, qq(1), "check the data is synced after enabling the subscription");
+
+# ------------------------------------------------------
+# Check that pg_upgrade will fail when the subscription's replication origin
+# does not exist.
+# ------------------------------------------------------
+my $old_sub2 = PostgreSQL::Test::Cluster->new('old_sub2');
+$old_sub2->init;
+$old_sub2->start;
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub2");
+
+$old_sub2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2"
+);
+
+$old_sub2->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub2 disable");
+
+my $subid = $old_sub2->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+my $reporigin = 'pg_'.qq($subid);
+$old_sub2->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')"
+);
+
+# Initialize the new subscriber
+my $new_sub2 = PostgreSQL::Test::Cluster->new('new_sub2');
+$new_sub2->init;
+
+$old_sub2->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub2->data_dir,
+		'-D',         $new_sub2->data_dir, '-b', $bindir,
+		'-B',         $bindir,            '-s', $new_sub2->host,
+		'-p',         $old_sub2->port,     '-P', $new_sub2->port,
+		$mode,        '--check',
+	],
+	'run of pg_upgrade --check for old instance with missing replication origin'
+);
+
+# Find a txt file that contains a list of the missing replication origins.
+# We cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subscription_state\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub2->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have regress_sub2 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/replication origin is missing for database:postgres subscription:regress_sub2/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index bd0b8873d3..a52dc8f735 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11383,6 +11383,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index bf50a32119..a4946b40b1 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2660,6 +2660,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#125vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#119)
Re: pg_upgrade and logical replication

On Mon, 13 Nov 2023 at 17:02, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Nov 10, 2023 at 7:26 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the comments, the attached v13 version patch has the
changes for the same.

+
+ ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname,
sizeof(originname));
+ originid = replorigin_by_name(originname, false);
+ replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+    false /* backward */ ,
+    false /* WAL log */ );

This seems to update the origin state only in memory. Is it sufficient
to use this here? Anyway, I think using this requires us to first
acquire RowExclusiveLock on pg_replication_origin something the patch
is doing for some other system table.

Added the lock.

The attached v14 patch at [1] has the changes for the same.
[1]: /messages/by-id/CALDaNm20=Bk_w9jDZXEqkJ3_NUAxOBswCn4jR-tmh-MqNpPZYw@mail.gmail.com

Regards,
Vignesh

#126vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#120)
Re: pg_upgrade and logical replication

On Mon, 13 Nov 2023 at 17:49, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 13, 2023 at 5:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Nov 10, 2023 at 7:26 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the comments, the attached v13 version patch has the
changes for the same.

+
+ ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname,
sizeof(originname));
+ originid = replorigin_by_name(originname, false);
+ replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+    false /* backward */ ,
+    false /* WAL log */ );

This seems to update the origin state only in memory. Is it sufficient
to use this here?

I think it is probably getting ensured by clean shutdown
(shutdown_checkpoint) which happens on the new cluster after calling
this function. We can probably try to add a comment for it. BTW, we
also need to ensure that max_replication_slots is configured to a
value higher than origins we are planning to create on the new
cluster.

Added comments and also added the check for max_replication_slots.

The attached v14 patch at [1] has the changes for the same.
[1]: /messages/by-id/CALDaNm20=Bk_w9jDZXEqkJ3_NUAxOBswCn4jR-tmh-MqNpPZYw@mail.gmail.com
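
For reference, the remote_lsn being advanced here uses PostgreSQL's textual pg_lsn form, two 32-bit hex halves separated by '/'. A minimal sketch of parsing and comparing such values outside the server (illustrative Python for debugging scripts, not part of the patch):

```python
def parse_pg_lsn(text: str) -> int:
    """Convert the textual pg_lsn form 'hi/lo' (two hex halves) into
    the 64-bit integer PostgreSQL uses internally, so two LSNs can be
    compared numerically."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

# After an upgrade, one could verify the preserved origin position did
# not move backwards relative to the value captured beforehand:
assert parse_pg_lsn("0/16B3748") <= parse_pg_lsn("1/0")
```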

Regards,
Vignesh

#127Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#124)
Re: pg_upgrade and logical replication

Here are some review comments for patch v14-0001

======
src/backend/utils/adt/pg_upgrade_support.c

1. binary_upgrade_replorigin_advance

+ /* lock to prevent the replication origin from vanishing */
+ LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+ originid = replorigin_by_name(originname, false);

Use uppercase for the lock comment.

======
src/bin/pg_upgrade/check.c

2. check_for_subscription_state

+ prep_status("Checking for subscription state");
+
+ snprintf(output_path, sizeof(output_path), "%s/%s",
+ log_opts.basedir,
+ "subscription_state.txt");

I felt this filename ought to be more like
'subscriptions_with_bad_state.txt' because the current name looks like
a normal logfile with nothing to indicate that it is only for the
states of the "bad" subscriptions.

I have intentionally kept the file name short: when the upgrade of the
publisher patch used a longer name, there were some buildfarm failures
because of the name length.

OK, but how about some other short meaningful name like 'subs_invalid.txt'?

I also thought "state" in the original name was misleading because
this file contains not only subscriptions with bad 'state' but also
subscriptions with missing 'origin'.

~~~

3. check_new_cluster_logical_replication_slots

int nslots_on_old;
int nslots_on_new;
+ int nsubs_on_old = old_cluster.subscription_count;

I felt it might be better to make both these quantities 'unsigned' to
make it more obvious that there are no special meanings for negative
numbers.

~~~

4. check_new_cluster_logical_replication_slots

nslots_on_old = count_old_cluster_logical_slots();

~

IMO the 'nsubs_on_old' should be coded the same as above. AFAICT, this
is the only code where you are interested in the number of
subscribers, and furthermore, it seems you only care about that count
in the *old* cluster. This means the current implementation of
get_subscription_count() seems more generic than it needs to be and
that results in more unnecessary patch code. (I will repeat this same
review comment in the other relevant places).

SUGGESTION
nslots_on_old = count_old_cluster_logical_slots();
nsubs_on_old = count_old_cluster_subscriptions();

~~~

5.
+ /*
+ * Quick return if there are no logical slots and subscriptions to be
+ * migrated.
+ */
+ if (nslots_on_old == 0 && nsubs_on_old == 0)
  return;

/and subscriptions/and no subscriptions/

~~~

6.
- if (nslots_on_old > max_replication_slots)
+ if (nslots_on_old && nslots_on_old > max_replication_slots)
  pg_fatal("max_replication_slots (%d) must be greater than or equal
to the number of "
  "logical replication slots (%d) on the old cluster",
  max_replication_slots, nslots_on_old);

Neither nslots_on_old nor max_replication_slots can be < 0, so I don't
see why the additional check is needed here.
AFAICT "if (nslots_on_old > max_replication_slots)" acheives the same
thing that you want.

~~~

7.
+ if (nsubs_on_old && nsubs_on_old > max_replication_slots)
+ pg_fatal("max_replication_slots (%d) must be greater than or equal
to the number of "
+ "subscriptions (%d) on the old cluster",
+ max_replication_slots, nsubs_on_old);

Neither nsubs_on_old nor max_replication_slots can be < 0, so I don't
see why the additional check is needed here.
AFAICT "if (nsubs_on_old > max_replication_slots)" achieves the same
thing that you want.

======
src/bin/pg_upgrade/info.c

8. get_db_rel_and_slot_infos

+ if (cluster == &old_cluster)
+ get_subscription_count(cluster);
+

I felt this is unnecessary because you only want to know the
nsubs_on_old in one place and then only for the old cluster, so
calling this to set a generic attribute for the cluster is overkill.

~~~

9.
+/*
+ * Get the number of subscriptions in the old cluster.
+ */
+static void
+get_subscription_count(ClusterInfo *cluster)
+{
+ PGconn    *conn;
+ PGresult   *res;
+
+ if (GET_MAJOR_VERSION(cluster->major_version) < 1700)
+ return;
+
+ conn = connectToServer(cluster, "template1");
+ res = executeQueryOrDie(conn,
+   "SELECT oid FROM pg_catalog.pg_subscription");
+
+ cluster->subscription_count = PQntuples(res);
+
+ PQclear(res);
+ PQfinish(conn);
+}

9a.
Currently, this is needed only for the old_cluster (like the function
comment implies), so the parameter is not required.

Also, AFAICT this number is only needed in one place
(check_new_cluster_logical_replication_slots) so IMO it would be
better to make lots of changes to simplify this code:
- change the function name to be like the other one. e.g.
count_old_cluster_subscriptions()
- function to return unsigned

SUGGESTION (something like this...)

unsigned
count_old_cluster_subscriptions(void)
{
unsigned nsubs = 0;

if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
{
PGconn *conn = connectToServer(&old_cluster, "template1");
PGresult *res = executeQueryOrDie(conn,
"SELECT oid FROM pg_catalog.pg_subscription");
nsubs = PQntuples(res);
PQclear(res);
PQfinish(conn);
}

return nsubs;
}

~

9b.
This function is returning 0 (aka not assigning
cluster->subscription_count) for clusters before PG17. IIUC this is
effectively the same behaviour as count_old_cluster_logical_slots()
but probably it needs to be mentioned more in this function comment
why it is like this.

======
src/bin/pg_upgrade/pg_upgrade.h

10.
const char *tablespace_suffix; /* directory specification */
+ int subscription_count; /* number of subscriptions */
} ClusterInfo;

I felt this is not needed because you only need to know the
nsubs_on_old in one place, so you can just call the counting function
from there. Making this a generic attribute for the cluster seems
overkill.

======
src/bin/pg_upgrade/t/004_subscription.pl

11. TEST: Check that pg_upgrade is successful when the table is in init state.

+$synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'";
+$old_sub1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";

But it doesn't get to "synchronize data", so should that message say
more like "Timed out while waiting for the table to reach INIT state"

~

12.
+command_ok(
+ [
+ 'pg_upgrade', '--no-sync',        '-d', $old_sub1->data_dir,
+ '-D',         $new_sub1->data_dir, '-b', $bindir,
+ '-B',         $bindir,            '-s', $new_sub1->host,
+ '-p',         $old_sub1->port,     '-P', $new_sub1->port,
+ $mode,
+ ],
+ 'run of pg_upgrade --check for old instance when the subscription
tables are in ready state'
+);

Should that message say "init state" instead of "ready state"?

~~~

13. TEST: when the subscription's replication origin does not exist.

+$old_sub2->safe_psql('postgres',
+ "ALTER SUBSCRIPTION regress_sub2 disable");

/disable/DISABLE/

~~~

14.
+my $subid = $old_sub2->safe_psql('postgres',
+ "SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+my $reporigin = 'pg_'.qq($subid);
+$old_sub2->safe_psql('postgres',
+ "SELECT pg_replication_origin_drop('$reporigin')"
+);

Maybe this part needs a comment to say the reason why the origin does
not exist -- it's because you found and explicitly dropped it.
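
For context, the origin name constructed in the test follows logical replication's naming convention of 'pg_' followed by the subscription's OID (the same construction as the `os.external_id = 'pg_' || s.oid` join used earlier in the test file); a trivial sketch:

```python
def subscription_origin_name(subid: int) -> str:
    """Replication origin name used for a subscription's main origin:
    'pg_<subscription oid>', matching the $reporigin construction in
    the test quoted above."""
    return "pg_%d" % subid
```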

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#128Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: vignesh C (#124)
RE: pg_upgrade and logical replication

Dear Vignesh,

Thanks for updating the patch! Here are some comments.
They are mainly cosmetic because I have not read yours these days.

01. binary_upgrade_add_sub_rel_state()

```
+    /* We must check these things before dereferencing the arguments */
+    if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+        elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed")
```

But the fourth argument can be NULL, right? I know you copied this from
other functions, but they do not allow NULL for any of their arguments.
One approach is that pg_dump explicitly
writes InvalidXLogRecPtr as the fourth argument.

02. binary_upgrade_add_sub_rel_state()

```
+    if (!OidIsValid(relid))
+        ereport(ERROR,
+                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                errmsg("invalid relation identifier used: %u", relid));
+
+    tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+    if (!HeapTupleIsValid(tup))
+        ereport(ERROR,
+                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                errmsg("relation %u does not exist", relid))
```

I'm not sure they should be ereport(). Aren't these conditions that can
never occur? Other upgrade funcs do not use ereport(), and I think these
messages do not have to be translated.

03. binary_upgrade_replorigin_advance()

IIUC this function is very similar to pg_replication_origin_advance(). Can we
extract a common part of them? I think pg_replication_origin_advance() will be
just a wrapper, and binary_upgrade_replorigin_advance() will get the name of
origin and pass to it.

04. binary_upgrade_replorigin_advance()

Even if you do not accept 03, some variable name can be follow the function.

05. getSubscriptions()

```
+ appendPQExpBufferStr(query, "o.remote_lsn AS suboriginremotelsn\n")
```

Hmm, this value is fetched unconditionally, but will be dumped only when the
cluster is PG17+. Should we avoid fetching the value, as is done for
subrunasowner and subpasswordrequired? Not sure...

06. dumpSubscriptionTable()

Can we assert that remote version is PG17+?

07. check_for_subscription_state()

IIUC, this function is used only for old cluster. Should we follow
check_old_cluster_for_valid_slots()?
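
For reference, the rule check_for_subscription_state enforces — a subscription relation blocks the upgrade unless its srsubstate is 'r' (ready), 'i' (init), or 's' (synchronized) — can be sketched as (illustrative Python, not the patch's C code):

```python
# pg_subscription_rel.srsubstate values considered safe to carry
# across an upgrade: ready, init, synchronized.
UPGRADEABLE_STATES = {"r", "i", "s"}

def blocking_relations(rels):
    """rels: iterable of (subname, schema, relname, srsubstate) tuples,
    as reported in subscription_state.txt.  Returns the entries that
    would make pg_upgrade --check fail."""
    return [r for r in rels if r[3] not in UPGRADEABLE_STATES]
```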

08. check_for_subscription_state()

```
+            fprintf(script, "database:%s subscription:%s schema:%s relation:%s state:%s not in required state\n",
+                    active_db->db_name,
+                    PQgetvalue(res, i, 0),
+                    PQgetvalue(res, i, 1),
+                    PQgetvalue(res, i, 2),
+                    PQgetvalue(res, i, 3));
```

IIRC, format strings should be double-quoted.

09. check_new_cluster_logical_replication_slots()

Checks for replication origin were added in check_new_cluster_logical_replication_slots(),
but I felt it became a super-function. Can we divide it?

10. check_new_cluster_logical_replication_slots()

Even if you reject above, it should be renamed.

11. pg_upgrade.h

```
+ int subscription_count; /* number of subscriptions */
```

Based on other struct, it should be "nsubscriptions".

12. 004_subscription.pl

```
+use File::Path qw(rmtree);
```

I think this is not used.

13. 004_subscription.pl

```
+my $bindir = $new_sub->config_data('--bindir');
```
For extensibility, it might be better to have separate old/new bindir variables.

14. 004_subscription.pl

```
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
```

Actually, I'm not sure it is really needed. wait_for_subscription_sync() in line 163
already ensures that the sync is done, doesn't it? Are there any holes around here?

15. 004_subscription.pl

```
+# Check the number of rows for each table on each server
+my $result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(50), "check initial tab_upgraded table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(1), "check initial tab_upgraded table data on publisher");
+$result =
+  $old_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(50),
+    "check initial tab_upgraded table data on the old subscriber");
+$result =
+  $old_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(0),
+    "check initial tab_not_upgraded table data on the old subscriber");
```

I'm not sure they are really needed. At the time pg_upgrade --check is called,
it won't have changed the state of the clusters.

16. pg_proc.dat

```
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
```

Based on other function, descr just should be "for use by pg_upgrade".

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

#129vignesh C
vignesh21@gmail.com
In reply to: vignesh C (#114)
1 attachment(s)
Re: pg_upgrade and logical replication

On Fri, 10 Nov 2023 at 19:26, vignesh C <vignesh21@gmail.com> wrote:

On Thu, 9 Nov 2023 at 12:23, Michael Paquier <michael@paquier.xyz> wrote:

Note: actually, this would be OK if we are able to keep the OIDs of
the subscribers consistent across upgrades? I'm OK to not do anything
about that in this patch, to keep it simpler. Just asking in passing.

I will analyze more on this and post the analysis in the subsequent mail.

I analyzed further and felt that retaining the subscription oid would be
cleaner: subscription/subscription_rel/replication_origin/replication_origin_status
would all keep using the same oid as before, which would probably also
help in supporting the upgrade of subscriptions in more scenarios later.
Here is a patch to handle the same.

Regards,
Vignesh

Attachments:

upgrade_retain_subscription_oid.patch (text/x-patch)
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index edc82c11be..1c7bb4b7cd 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -75,6 +75,9 @@
 /* check if the 'val' has 'bits' set */
 #define IsSet(val, bits)  (((val) & (bits)) == (bits))
 
+/* Potentially set by pg_upgrade_support functions */
+Oid			binary_upgrade_next_pg_subscription_oid = InvalidOid;
+
 /*
  * Structure to hold a bitmap representing the user-provided CREATE/ALTER
  * SUBSCRIPTION command options and the parsed/default values of each of them.
@@ -679,8 +682,23 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	memset(values, 0, sizeof(values));
 	memset(nulls, false, sizeof(nulls));
 
-	subid = GetNewOidWithIndex(rel, SubscriptionObjectIndexId,
-							   Anum_pg_subscription_oid);
+	/* Use binary-upgrade override for pg_subscription.oid? */
+	if (IsBinaryUpgrade)
+	{
+		if (!OidIsValid(binary_upgrade_next_pg_subscription_oid))
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("pg_subscription OID value not set when in binary upgrade mode")));
+
+		subid = binary_upgrade_next_pg_subscription_oid;
+		binary_upgrade_next_pg_subscription_oid = InvalidOid;
+	}
+	else
+	{
+		subid = GetNewOidWithIndex(rel, SubscriptionObjectIndexId,
+								   Anum_pg_subscription_oid);
+	}
+
 	values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
 	values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
 	values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 53cfa72b6f..34c328ea0d 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -59,6 +59,17 @@ binary_upgrade_set_next_pg_type_oid(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
+Datum
+binary_upgrade_set_next_pg_subscription_oid(PG_FUNCTION_ARGS)
+{
+	Oid			subid = PG_GETARG_OID(0);
+
+	CHECK_IS_BINARY_UPGRADE;
+	binary_upgrade_next_pg_subscription_oid = subid;
+
+	PG_RETURN_VOID();
+}
+
 Datum
 binary_upgrade_set_next_array_pg_type_oid(PG_FUNCTION_ARGS)
 {
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4528b7cc39..bc309be2a8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4954,6 +4954,14 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 	appendPQExpBuffer(delq, "DROP SUBSCRIPTION %s;\n",
 					  qsubname);
 
+	if (dopt->binary_upgrade)
+	{
+		appendPQExpBufferStr(query, "\n-- For binary upgrade, must preserve pg_subscription.oid\n");
+		appendPQExpBuffer(query,
+						  "SELECT pg_catalog.binary_upgrade_set_next_pg_subscription_oid('%u'::pg_catalog.oid);\n\n",
+						  subinfo->dobj.catId.oid);
+	}
+
 	appendPQExpBuffer(query, "CREATE SUBSCRIPTION %s CONNECTION ",
 					  qsubname);
 	appendStringLiteralAH(query, subinfo->subconninfo, fout);
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 82a9125ba9..456e777b8d 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -19,6 +19,7 @@
 extern PGDLLIMPORT Oid binary_upgrade_next_pg_tablespace_oid;
 
 extern PGDLLIMPORT Oid binary_upgrade_next_pg_type_oid;
+extern PGDLLIMPORT Oid binary_upgrade_next_pg_subscription_oid;
 extern PGDLLIMPORT Oid binary_upgrade_next_array_pg_type_oid;
 extern PGDLLIMPORT Oid binary_upgrade_next_mrng_pg_type_oid;
 extern PGDLLIMPORT Oid binary_upgrade_next_mrng_array_pg_type_oid;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 45c681db5e..43cf39acae 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11328,6 +11328,10 @@
   proname => 'binary_upgrade_set_next_pg_type_oid', provolatile => 'v',
   proparallel => 'r', prorettype => 'void', proargtypes => 'oid',
   prosrc => 'binary_upgrade_set_next_pg_type_oid' },
+{ oid => '8406', descr => 'for use by pg_upgrade',
+  proname => 'binary_upgrade_set_next_pg_subscription_oid', provolatile => 'v',
+  proparallel => 'r', prorettype => 'void', proargtypes => 'oid',
+  prosrc => 'binary_upgrade_set_next_pg_subscription_oid' },
 { oid => '3584', descr => 'for use by pg_upgrade',
   proname => 'binary_upgrade_set_next_array_pg_type_oid', provolatile => 'v',
   proparallel => 'r', prorettype => 'void', proargtypes => 'oid',
#130 vignesh C
vignesh21@gmail.com
In reply to: vignesh C (#129)
1 attachment(s)
Re: pg_upgrade and logical replication

On Sun, 19 Nov 2023 at 06:52, vignesh C <vignesh21@gmail.com> wrote:

On Fri, 10 Nov 2023 at 19:26, vignesh C <vignesh21@gmail.com> wrote:

On Thu, 9 Nov 2023 at 12:23, Michael Paquier <michael@paquier.xyz> wrote:

Note: actually, this would be OK if we are able to keep the OIDs of
the subscribers consistent across upgrades? I'm OK with not doing
anything about that in this patch, to keep it simpler. Just asking in passing.

I will analyze more on this and post the analysis in the subsequent mail.

I analyzed further and felt that retaining the subscription oid would be
cleaner, as subscription/subscription_rel/replication_origin/replication_origin_status
would all keep using the same oid as before, and it would probably also
help in supporting the upgrade of subscriptions in more scenarios later.
Here is a patch to handle the same.

Sorry I had attached the older patch, here is the correct updated one.

Regards,
Vignesh

Attachments:

v1-0001-Retain-the-subscription-oids-during-upgrade.patch (text/x-patch, charset US-ASCII)
From 94b1ca337498f2b2e5368c3f7179ba63dd75954a Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Sun, 19 Nov 2023 06:53:59 +0530
Subject: [PATCH v1] Retain the subscription oids during upgrade.

Retain the subscription oids during upgrade.
---
 src/backend/commands/subscriptioncmds.c    | 22 ++++++++++++++++++++--
 src/backend/utils/adt/pg_upgrade_support.c | 10 ++++++++++
 src/bin/pg_dump/pg_dump.c                  |  8 ++++++++
 src/include/catalog/binary_upgrade.h       |  1 +
 src/include/catalog/pg_proc.dat            |  4 ++++
 5 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index edc82c11be..1c7bb4b7cd 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -75,6 +75,9 @@
 /* check if the 'val' has 'bits' set */
 #define IsSet(val, bits)  (((val) & (bits)) == (bits))
 
+/* Potentially set by pg_upgrade_support functions */
+Oid			binary_upgrade_next_pg_subscription_oid = InvalidOid;
+
 /*
  * Structure to hold a bitmap representing the user-provided CREATE/ALTER
  * SUBSCRIPTION command options and the parsed/default values of each of them.
@@ -679,8 +682,23 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	memset(values, 0, sizeof(values));
 	memset(nulls, false, sizeof(nulls));
 
-	subid = GetNewOidWithIndex(rel, SubscriptionObjectIndexId,
-							   Anum_pg_subscription_oid);
+	/* Use binary-upgrade override for pg_subscription.oid? */
+	if (IsBinaryUpgrade)
+	{
+		if (!OidIsValid(binary_upgrade_next_pg_subscription_oid))
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("pg_subscription OID value not set when in binary upgrade mode")));
+
+		subid = binary_upgrade_next_pg_subscription_oid;
+		binary_upgrade_next_pg_subscription_oid = InvalidOid;
+	}
+	else
+	{
+		subid = GetNewOidWithIndex(rel, SubscriptionObjectIndexId,
+								   Anum_pg_subscription_oid);
+	}
+
 	values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
 	values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
 	values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..f5be088e6e 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -172,6 +172,16 @@ binary_upgrade_set_next_pg_authid_oid(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
+Datum
+binary_upgrade_set_next_pg_subscription_oid(PG_FUNCTION_ARGS)
+{
+	Oid			subid = PG_GETARG_OID(0);
+
+	CHECK_IS_BINARY_UPGRADE;
+	binary_upgrade_next_pg_subscription_oid = subid;
+	PG_RETURN_VOID();
+}
+
 Datum
 binary_upgrade_create_empty_extension(PG_FUNCTION_ARGS)
 {
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 34fd0a86e9..f592d7c979 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4775,6 +4775,14 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 	appendPQExpBuffer(delq, "DROP SUBSCRIPTION %s;\n",
 					  qsubname);
 
+	if (dopt->binary_upgrade)
+	{
+		appendPQExpBufferStr(query, "\n-- For binary upgrade, must preserve pg_subscription.oid\n");
+		appendPQExpBuffer(query,
+						  "SELECT pg_catalog.binary_upgrade_set_next_pg_subscription_oid('%u'::pg_catalog.oid);\n\n",
+						  subinfo->dobj.catId.oid);
+	}
+
 	appendPQExpBuffer(query, "CREATE SUBSCRIPTION %s CONNECTION ",
 					  qsubname);
 	appendStringLiteralAH(query, subinfo->subconninfo, fout);
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 82a9125ba9..dc7b251051 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -32,6 +32,7 @@ extern PGDLLIMPORT RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumbe
 
 extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
 extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
+extern PGDLLIMPORT Oid binary_upgrade_next_pg_subscription_oid;
 
 extern PGDLLIMPORT bool binary_upgrade_record_init_privs;
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fb58dee3bc..4891a236dd 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11396,6 +11396,10 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8406', descr => 'for use by pg_upgrade',
+  proname => 'binary_upgrade_set_next_pg_subscription_oid', provolatile => 'v',
+  proparallel => 'r', prorettype => 'void', proargtypes => 'oid',
+  prosrc => 'binary_upgrade_set_next_pg_subscription_oid' },
 
 # conversion functions
 { oid => '4302',
-- 
2.34.1

#131 vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#127)
1 attachment(s)
Re: pg_upgrade and logical replication

On Thu, 16 Nov 2023 at 07:45, Peter Smith <smithpb2250@gmail.com> wrote:

Here are some review comments for patch v14-0001

======
src/backend/utils/adt/pg_upgrade_support.c

1. binary_upgrade_replorigin_advance

+ /* lock to prevent the replication origin from vanishing */
+ LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+ originid = replorigin_by_name(originname, false);

Use uppercase for the lock comment.

Modified

======
src/bin/pg_upgrade/check.c

2. check_for_subscription_state

+ prep_status("Checking for subscription state");
+
+ snprintf(output_path, sizeof(output_path), "%s/%s",
+ log_opts.basedir,
+ "subscription_state.txt");

I felt this filename ought to be more like
'subscriptions_with_bad_state.txt' because the current name looks like
a normal logfile with nothing to indicate that it is only for the
states of the "bad" subscriptions.

I have intentionally kept the file name short: we noticed some buildfarm
failures caused by longer names when the upgrade-of-publisher patch used
one.

OK, but how about some other short meaningful name like 'subs_invalid.txt'?

I also thought "state" in the original name was misleading because
this file contains not only subscriptions with bad 'state' but also
subscriptions with missing 'origin'.

Modified

~~~

3. check_new_cluster_logical_replication_slots

int nslots_on_old;
int nslots_on_new;
+ int nsubs_on_old = old_cluster.subscription_count;

I felt it might be better to make both these quantities 'unsigned' to
make it more obvious that there are no special meanings for negative
numbers.

I have used int itself, as everything else, including the logical slots
code, also uses int. I tried making the change, but it left the code
inconsistent, so I used int as is done for the others.

~~~

4. check_new_cluster_logical_replication_slots

nslots_on_old = count_old_cluster_logical_slots();

~

IMO the 'nsubs_on_old' should be coded the same as above. AFAICT, this
is the only code where you are interested in the number of
subscribers, and furthermore, it seems you only care about that count
in the *old* cluster. This means the current implementation of
get_subscription_count() seems more generic than it needs to be and
that results in more unnecessary patch code. (I will repeat this same
review comment in the other relevant places).

SUGGESTION
nslots_on_old = count_old_cluster_logical_slots();
nsubs_on_old = count_old_cluster_subscriptions();

Modified to keep it similar to logical slot implementation.

~~~

5.
+ /*
+ * Quick return if there are no logical slots and subscriptions to be
+ * migrated.
+ */
+ if (nslots_on_old == 0 && nsubs_on_old == 0)
return;

/and subscriptions/and no subscriptions/

Modified

~~~

6.
- if (nslots_on_old > max_replication_slots)
+ if (nslots_on_old && nslots_on_old > max_replication_slots)
pg_fatal("max_replication_slots (%d) must be greater than or equal
to the number of "
"logical replication slots (%d) on the old cluster",
max_replication_slots, nslots_on_old);

Neither nslots_on_old nor max_replication_slots can be < 0, so I don't
see why the additional check is needed here.
AFAICT "if (nslots_on_old > max_replication_slots)" achieves the same
thing that you want.

This part of code is changed now

~~~

7.
+ if (nsubs_on_old && nsubs_on_old > max_replication_slots)
+ pg_fatal("max_replication_slots (%d) must be greater than or equal
to the number of "
+ "subscriptions (%d) on the old cluster",
+ max_replication_slots, nsubs_on_old);

Neither nsubs_on_old nor max_replication_slots can be < 0, so I don't
see why the additional check is needed here.
AFAICT "if (nsubs_on_old > max_replication_slots)" achieves the same
thing that you want.

This part of code is changed now

======
src/bin/pg_upgrade/info.c

8. get_db_rel_and_slot_infos

+ if (cluster == &old_cluster)
+ get_subscription_count(cluster);
+

I felt this is unnecessary because you only want to know the
nsubs_on_old in one place and then only for the old cluster, so
calling this to set a generic attribute for the cluster is overkill.

We need to do this here because the old cluster will not be running when
we do the validation of the new cluster. I have made the flow similar to
the logical slots now.

~~~

9.
+/*
+ * Get the number of subscriptions in the old cluster.
+ */
+static void
+get_subscription_count(ClusterInfo *cluster)
+{
+ PGconn    *conn;
+ PGresult   *res;
+
+ if (GET_MAJOR_VERSION(cluster->major_version) < 1700)
+ return;
+
+ conn = connectToServer(cluster, "template1");
+ res = executeQueryOrDie(conn,
+   "SELECT oid FROM pg_catalog.pg_subscription");
+
+ cluster->subscription_count = PQntuples(res);
+
+ PQclear(res);
+ PQfinish(conn);
+}

9a.
Currently, this is needed only for the old_cluster (like the function
comment implies), so the parameter is not required.

Also, AFAICT this number is only needed in one place
(check_new_cluster_logical_replication_slots) so IMO it would be
better to make lots of changes to simplify this code:
- change the function name to be like the other one. e.g.
count_old_cluster_subscriptions()
- function to return unsigned

SUGGESTION (something like this...)

unsigned
count_old_cluster_subscriptions(void)
{
unsigned nsubs = 0;

if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
{
PGconn *conn = connectToServer(&old_cluster, "template1");
PGresult *res = executeQueryOrDie(conn,
"SELECT oid FROM pg_catalog.pg_subscription");
nsubs = PQntuples(res);
PQclear(res);
PQfinish(conn);
}

return nsubs;
}

This function is not needed anymore; I have made the logic similar to the logical slots.

~

9b.
This function is returning 0 (aka not assigning
cluster->subscription_count) for clusters before PG17. IIUC this is
effectively the same behaviour as count_old_cluster_logical_slots()
but probably it needs to be mentioned more in this function comment
why it is like this.

This function is not needed anymore; I have made the logic similar to the logical slots.

======
src/bin/pg_upgrade/pg_upgrade.h

10.
const char *tablespace_suffix; /* directory specification */
+ int subscription_count; /* number of subscriptions */
} ClusterInfo;

I felt this is not needed because you only need to know the
nsubs_on_old in one place, so you can just call the counting function
from there. Making this a generic attribute for the cluster seems
overkill.

We need to do this here because the old cluster will not be running when
we do the validation of the new cluster. I have made the flow similar to
the logical slots now.

======
src/bin/pg_upgrade/t/004_subscription.pl

11. TEST: Check that pg_upgrade is successful when the table is in init state.

+$synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'";
+$old_sub1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";

But it doesn't get to "synchronize data", so should that message say
more like "Timed out while waiting for the table to reach INIT state"

Modified

~

12.
+command_ok(
+ [
+ 'pg_upgrade', '--no-sync',        '-d', $old_sub1->data_dir,
+ '-D',         $new_sub1->data_dir, '-b', $bindir,
+ '-B',         $bindir,            '-s', $new_sub1->host,
+ '-p',         $old_sub1->port,     '-P', $new_sub1->port,
+ $mode,
+ ],
+ 'run of pg_upgrade --check for old instance when the subscription
tables are in ready state'
+);

Should that message say "init state" instead of "ready state"?

Modified

~~~

13. TEST: when the subscription's replication origin does not exist.

+$old_sub2->safe_psql('postgres',
+ "ALTER SUBSCRIPTION regress_sub2 disable");

/disable/DISABLE/

Modified

~~~

14.
+my $subid = $old_sub2->safe_psql('postgres',
+ "SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+my $reporigin = 'pg_'.qq($subid);
+$old_sub2->safe_psql('postgres',
+ "SELECT pg_replication_origin_drop('$reporigin')"
+);

Maybe this part needs a comment to say the reason why the origin does
not exist -- it's because you found and explicitly dropped it.

Modified

The attached v15 version patch has the changes for the same.

Regards,
Vignesh

Attachments:

v15-0001-Preserve-the-full-subscription-s-state-during-pg.patch (text/x-patch, charset US-ASCII)
From cc2250f60e3f7cfd5b395acfc239ef72d8b51732 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v15] Preserve the full subscription's state during pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_add_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_add_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, state is the state of the relation, and sublsn is the
subscription LSN, which is optional and defaults to NULL/InvalidXLogRecPtr
if not provided.  pg_dump will retrieve these values (subname, relid, state
and sublsn) from the old cluster.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To fix this problem, this patch teaches pg_dump to update the replication
origin along with create subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is the
subscription LSN.  pg_dump will retrieve these values (subname and sublsn)
from the old cluster.

pg_upgrade will check that all the subscription relations are in the 'i'
(init), 's' (data sync) or 'r' (ready) state, and will error out if that's
not the case, logging the reason for the failure.

Author: Julien Rouhaud, Vignesh C
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  72 +++++
 src/backend/utils/adt/pg_upgrade_support.c | 125 +++++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 200 +++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  16 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 176 ++++++++++++-
 src/bin/pg_upgrade/info.c                  |  59 ++++-
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   2 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 290 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/tools/pgindent/typedefs.list           |   1 +
 13 files changed, 976 insertions(+), 9 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 4f78e0e1c0..5b8863c7fd 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,78 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies which includes the subscription table information present in
+     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and also the subscription replication origin. This allows
+     logical replication on the new subscriber to continue from where the
+     old subscriber was up to. Migration of subscription dependencies is only
+     supported when the old cluster is version 17.0 or later. Subscription
+     dependencies on clusters before version 17.0 will silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize), <literal>r</literal> (ready) or
+       <literal>s</literal> (synchronized). This can be verified by checking
+       <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, do this:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Create all the new tables that were created in the publication during
+       upgrade and refresh the publication by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..53cfa72b6f 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,22 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +312,121 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	remote_commit;
+	char		originname[NAMEDATALEN];
+	RepOriginId node;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	remote_commit = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* Lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	node = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster. Shutdown server will flush the origins during shutdown
+	 * checkpoint.
+	 */
+	replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 34fd0a86e9..8d2a8e4ffa 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4583,6 +4584,95 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4609,6 +4699,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4664,16 +4755,28 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, "o.remote_lsn AS suboriginremotelsn\n");
+	else
+		appendPQExpBufferStr(query, "NULL AS suboriginremotelsn\n");
+
+	appendPQExpBufferStr(query,
+						 "FROM pg_subscription s\n");
+
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query,
+							 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+							 "    ON o.external_id = 'pg_' || s.oid::text \n");
+
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4700,6 +4803,7 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4737,6 +4841,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4746,6 +4855,76 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to pg_subscription_rel table. This will be used only in
+		 * binary-upgrade mode.
+		 */
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+		appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+		appendPQExpBuffer(query,
+						  ", %u, '%c'",
+						  subrinfo->tblinfo->dobj.catId.oid,
+						  subrinfo->srsubstate);
+
+		if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+			appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn);
+		else
+			appendPQExpBufferStr(query, ", NULL");
+
+		appendPQExpBufferStr(query, ");\n");
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4826,6 +5005,17 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000 &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10444,6 +10634,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18510,6 +18703,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..62b3d9249b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -671,8 +672,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +711,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +771,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..40f35975d3 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -34,7 +34,9 @@ static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 static void check_new_cluster_logical_replication_slots(void);
+static void check_new_cluster_subscription_configuration(void);
 static void check_old_cluster_for_valid_slots(bool live_check);
+static void check_old_cluster_subscription_state(ClusterInfo *cluster);
 
 
 /*
@@ -112,6 +114,13 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
+	/*
+	 * Subscription dependencies can be migrated since PG17. See comments atop
+	 * get_old_cluster_subscription_count().
+	 */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+		check_old_cluster_subscription_state(&old_cluster);
+
 	/*
 	 * Logical replication slots can be migrated since PG17. See comments atop
 	 * get_old_cluster_logical_slot_infos().
@@ -237,6 +246,8 @@ check_new_cluster(void)
 	check_for_new_tablespace_dir();
 
 	check_new_cluster_logical_replication_slots();
+
+	check_new_cluster_subscription_configuration();
 }
 
 
@@ -1488,7 +1499,7 @@ check_new_cluster_logical_replication_slots(void)
 
 	nslots_on_old = count_old_cluster_logical_slots();
 
-	/* Quick return if there are no logical slots to be migrated. */
+	/* Quick return if there are no logical slots to be migrated */
 	if (nslots_on_old == 0)
 		return;
 
@@ -1538,6 +1549,53 @@ check_new_cluster_logical_replication_slots(void)
 	check_ok();
 }
 
+/*
+ * check_new_cluster_subscription_configuration()
+ *
+ * Verify that max_replication_slots on the new cluster is high enough to
+ * re-create all of the old cluster's subscriptions.
+ */
+static void
+check_new_cluster_subscription_configuration(void)
+{
+	PGresult   *res;
+	PGconn	   *conn;
+	int			nsubs_on_old;
+	int			max_replication_slots;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+		return;
+
+	nsubs_on_old = count_old_cluster_subscriptions();
+
+	/* Quick return if there are no subscriptions to be migrated */
+	if (nsubs_on_old == 0)
+		return;
+
+	prep_status("Checking for new cluster configuration for subscriptions");
+
+	conn = connectToServer(&new_cluster, "template1");
+
+	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+							"WHERE name = 'max_replication_slots';");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine parameter settings on new cluster");
+
+	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
+	if (nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
+	PQclear(res);
+	PQfinish(conn);
+
+	check_ok();
+}
+
 /*
  * check_old_cluster_for_valid_slots()
  *
@@ -1613,3 +1671,119 @@ check_old_cluster_for_valid_slots(bool live_check)
 
 	check_ok();
 }
+
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that every subscription has all of its tables in i (initialize),
+ * r (ready) or s (synchronized) state.
+ */
+static void
+check_old_cluster_subscription_state(ClusterInfo *cluster)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subs_invalid.txt");
+	for (int dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "replication origin is missing for database:\"%s\" subscription:\"%s\"\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * In the 'i' (initialize) state the tablesync slot has not been
+		 * created yet, while in the 'r' (ready) and 's' (synchronized)
+		 * states the slot was created previously but has already been
+		 * dropped.  These states are supported for upgrade.  The other
+		 * states listed below are not ok:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * tracked by the publisher does not match anymore.
+		 *
+		 * b) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * c) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, n.nspname, c.relname, r.srsubstate "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r', 's') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\" state:\"%s\" not in required state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in i (initialize), r (ready) or s (synchronized) state.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..cfa4dcc19c 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,7 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
-
+static void get_old_cluster_subscription_count(DbInfo *dbinfo);
 
 /*
  * gen_db_file_maps()
@@ -293,10 +293,14 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		get_rel_infos(cluster, pDbInfo);
 
 		/*
-		 * Retrieve the logical replication slots infos for the old cluster.
+		 * Retrieve the logical replication slots infos and the subscriptions
+		 * count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
+		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
+			get_old_cluster_subscription_count(pDbInfo);
+		}
 	}
 
 	if (cluster == &old_cluster)
@@ -365,7 +369,6 @@ get_template0_info(ClusterInfo *cluster)
 	PQfinish(conn);
 }
 
-
 /*
  * get_db_infos()
  *
@@ -730,6 +733,56 @@ count_old_cluster_logical_slots(void)
 	return slot_count;
 }
 
+/*
+ * get_old_cluster_subscription_count()
+ *
+ * Counts the number of subscriptions in the database.
+ *
+ * Note: This function will not do anything if the old cluster is pre-PG17,
+ * because logical slots are not migrated from older versions, so a logical
+ * replication setup could not be fully upgraded anyway.
+ */
+static void
+get_old_cluster_subscription_count(DbInfo *dbinfo)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+		return;
+
+	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	res = executeQueryOrDie(conn, "SELECT count(*) "
+							"FROM pg_catalog.pg_subscription WHERE subdbid = %d",
+							dbinfo->db_oid);
+
+	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the total number of subscriptions across all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 and prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_old_cluster_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+	int			nsubs = 0;
+
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+	return nsubs;
+}
+
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..d63f13fffc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -195,6 +195,7 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
+	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -421,6 +422,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
+int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..758499c5b0
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,290 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+my $oldbindir = $old_sub->config_data('--bindir');
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $newbindir = $new_sub->config_data('--bindir');
+
+sub insert_line_at_pub
+{
+	my $payload = shift;
+
+	foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line_at_pub('before initial sync');
+
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub FOR TABLE tab_upgraded1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION regress_pub"
+);
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready or in
+# init state.
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded1 VALUES (generate_series(2,50), 'before initial sync')"
+);
+$publisher->wait_for_catchup('regress_sub');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub->restart;
+
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+
+# The newly added table will remain in init state because the subscriber's
+# max_logical_replication_workers is set to 0.
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach init state";
+
+# Get the replication origin remote_lsn of the old subscriber
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+# Insert a row into each publisher table while the old subscriber is down.
+insert_line_at_pub('while old_sub is down');
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $oldbindir,
+		'-B',         $newbindir,         '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode
+	],
+	'run of pg_upgrade for old instance when the subscription tables are in ready state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Add tab_not_upgraded1 to the publication
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_not_upgraded1");
+
+$new_sub->start;
+
+# Subscription relations should be preserved
+my $result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2),
+	"There should be 2 rows in pg_subscription_rel (representing tab_upgraded1 and tab_upgraded2)"
+);
+
+# The replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s where os.external_id = 'pg_' || s.oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# Check the number of rows for each table on each server
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check initial tab_upgraded1 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2), "check initial tab_upgraded2 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2), "check initial tab_not_upgraded1 table data on publisher");
+
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(50),
+	"check initial tab_upgraded1 table data on the new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(0),
+	"check initial tab_upgraded2 table data on upgraded subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"check initial tab_not_upgraded1 table data on the new subscriber");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub ENABLE");
+
+$publisher->wait_for_catchup('regress_sub');
+
+# Rows on tab_upgraded1 and tab_upgraded2 should have been replicated, while
+# nothing should happen for tab_not_upgraded1.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2),
+	"check the data is synced after enabling the subscription for the table that was in init state"
+);
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"no change in table tab_not_upgraded1 after enabling the subscription, as the table is not part of the publication"
+);
+
+# Refresh the subscription, the missing row on tab_not_upgraded1 should be
+# replicated.
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+# cleanup
+$new_sub->stop;
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 4");
+$old_sub->start;
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run a) if there's a subscription with tables
+# in a state different than 'r' (ready), 'i' (init) and 's' (synchronized)
+# and/or b) if the subscription does not have a replication origin.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+
+$publisher->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$old_sub->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+# Insert the same value that is already present on the publisher into the
+# subscriber's primary key column so that the table sync will fail.
+$old_sub->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub1 FOR TABLE tab_primary_key");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+
+my $subid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+my $reporigin = "pg_$subid";
+
+# Drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d',         $old_sub->data_dir,
+		'-D',         $new_sub1->data_dir,
+		'-b',         $oldbindir,
+		'-B',         $newbindir,
+		'-s',         $new_sub1->host,
+		'-p',         $old_sub->port,
+		'-P',         $new_sub1->port,
+		$mode,        '--check',
+	],
	'run of pg_upgrade --check for old instance with relation in \'d\' datasync (invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub1->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/database:\"postgres\" subscription:\"regress_sub1\" schema:\"public\" relation:\"tab_primary_key\" state:\"d\" not in required state/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub2 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/replication origin is missing for database:\"postgres\" subscription:\"regress_sub2\"/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fb58dee3bc..45c681db5e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11396,6 +11396,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index dba3498a13..eaa5c5a7cb 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2661,6 +2661,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#132vignesh C
vignesh21@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#128)
Re: pg_upgrade and logical replication

On Thu, 16 Nov 2023 at 18:25, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Dear Vignesh,

Thanks for updating the patch! Here are some comments.
They are mainly cosmetic because I have not read yours these days.

01. binary_upgrade_add_sub_rel_state()

```
+    /* We must check these things before dereferencing the arguments */
+    if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+        elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed")
```

But fourth argument can be NULL, right? I know you copied from other functions,
but they do not accept for all arguments. One approach is that pg_dump explicitly
writes InvalidXLogRecPtr as the fourth argument.

I did not find any problem with this approach: if the lsn is valid
(e.g. in ready state) we send a valid lsn, and if it is not valid
(e.g. in init state) we pass NULL. This approach was also
suggested at [1] (message-id ZQvbV2sdzBY6WEBl@paquier.xyz).

02. binary_upgrade_add_sub_rel_state()

```
+    if (!OidIsValid(relid))
+        ereport(ERROR,
+                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                errmsg("invalid relation identifier used: %u", relid));
+
+    tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+    if (!HeapTupleIsValid(tup))
+        ereport(ERROR,
+                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                errmsg("relation %u does not exist", relid));
```

I'm not sure they should be ereport(). Won't these cases simply never
occur? Other upgrade funcs do not have ereport(), and I think the
messages do not have to be translated.

I have removed the first check and retained the second one for a sanity check.

03. binary_upgrade_replorigin_advance()

IIUC this function is very similar to pg_replication_origin_advance(). Can we
extract a common part of them? I think pg_replication_origin_advance() will be
just a wrapper, and binary_upgrade_replorigin_advance() will get the name of
origin and pass to it.

We would be able to reduce barely 4 lines, so I felt the existing code is better.

04. binary_upgrade_replorigin_advance()

Even if you do not accept 03, some variable names could follow the function's naming.

Modified

05. getSubscriptions()

```
+ appendPQExpBufferStr(query, "o.remote_lsn AS suboriginremotelsn\n");
```

Hmm, this value is taken anyway, but will be dumped only when the cluster is PG17+.
Should we avoid getting the value like subrunasowner and subpasswordrequired?
Not sure...

Modified

06. dumpSubscriptionTable()

Can we assert that remote version is PG17+?

Modified

07. check_for_subscription_state()

IIUC, this function is used only for the old cluster. Should we follow
check_old_cluster_for_valid_slots()?

Modified

08. check_for_subscription_state()

```
+            fprintf(script, "database:%s subscription:%s schema:%s relation:%s state:%s not in required state\n",
+                    active_db->db_name,
+                    PQgetvalue(res, i, 0),
+                    PQgetvalue(res, i, 1),
+                    PQgetvalue(res, i, 2),
+                    PQgetvalue(res, i, 3));
```

IIRC, format strings should be double-quoted.

Modified

09. check_new_cluster_logical_replication_slots()

Checks for replication origin were added in check_new_cluster_logical_replication_slots(),
but I felt it became a super function. Can we divide it?

Modified

10. check_new_cluster_logical_replication_slots()

Even if you reject above, it should be renamed.

Since the previous is handled, this is not valid.

11. pg_upgrade.h

```
+ int subscription_count; /* number of subscriptions */
```

Based on other structs, it should be "nsubscriptions".

Modified

12. 004_subscription.pl

```
+use File::Path qw(rmtree);
```

I think this is not used.

Modified

13. 004_subscription.pl

```
+my $bindir = $new_sub->config_data('--bindir');
```
For extensibility, it might be better to separate for old/new bindir.

Modified

14. 004_subscription.pl

```
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
```

Actually, I'm not sure it is really needed. Doesn't wait_for_subscription_sync() in line 163
ensure that the sync is done? Are there any holes around here?

wait_for_subscription_sync will check if the table is in syncdone or
ready state; since we are allowing the syncdone state, I have removed
this part.

15. 004_subscription.pl

```
+# Check the number of rows for each table on each server
+my $result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(50), "check initial tab_upgraded table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(1), "check initial tab_upgraded table data on publisher");
+$result =
+  $old_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(50),
+    "check initial tab_upgraded table data on the old subscriber");
+$result =
+  $old_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(0),
+    "check initial tab_not_upgraded table data on the old subscriber");
```

I'm not sure they are really needed. At the time pg_upgrade --check is
called, this won't change the state of the clusters.

In the newer version, that check has now been removed, so these are required.

16. pg_proc.dat

```
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
```

Based on other functions, the descr should just be "for use by pg_upgrade".

This was improvised based on one of the earlier comments at [1].
The v15 version attached at [2] has the changes for the comments.

[1]: /messages/by-id/ZQvbV2sdzBY6WEBl@paquier.xyz
[2]: /messages/by-id/CALDaNm2ssmSFs4bjpfxbkfUbPE=xFSGqxFoip87kF259FG=X2g@mail.gmail.com

Regards,
Vignesh

#133Michael Paquier
michael@paquier.xyz
In reply to: vignesh C (#130)
Re: pg_upgrade and logical replication

On Sun, Nov 19, 2023 at 06:56:05AM +0530, vignesh C wrote:

On Sun, 19 Nov 2023 at 06:52, vignesh C <vignesh21@gmail.com> wrote:

On Fri, 10 Nov 2023 at 19:26, vignesh C <vignesh21@gmail.com> wrote:

I will analyze more on this and post the analysis in the subsequent mail.

I analyzed further and felt that retaining the subscription oid would be
cleaner, as subscription/subscription_rel/replication_origin/replication_origin_status
would all keep using the same oid as earlier; it would also probably help
in supporting upgrade of subscriptions in more scenarios later.
Here is a patch to handle the same.

Sorry I had attached the older patch, here is the correct updated one.

Thanks for digging into that. I think that we should consider that
once the main patch is merged and stable in the tree for v17 to get a
more consistent experience. Shouldn't this include a test in the new
TAP test for the upgrade of subscriptions? It should be as simple as
cross-checking the OIDs of the subscriptions before and after the
upgrade.
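For instance, the test could run something like (a sketch):

```sql
SELECT oid, subname FROM pg_subscription ORDER BY subname;
```

on both clusters and verify the two results match.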
--
Michael

#134Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#122)
Re: pg_upgrade and logical replication

On Tue, Nov 14, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:

On Mon, 13 Nov 2023 at 13:52, Michael Paquier <michael@paquier.xyz> wrote:

Anyway, after a closer lookup, I think that your conclusions regarding
the states that are allowed in the patch during the upgrade have some
flaws.

First, are you sure that SYNCDONE is OK to keep? This catalog state
is set in process_syncing_tables_for_sync(), and just after the code
opens a transaction to clean up the tablesync slot, followed by a
second transaction to clean up the origin. However, imagine that
there is a failure in dropping the slot, the origin, or just in
transaction processing, cannot we finish in a state where the relation
is marked as SYNCDONE in the catalog but still has an origin and/or a
tablesync slot lying around? Assuming that SYNCDONE is an OK state
seems incorrect to me. I am pretty sure that injecting an error in a
code path after the slot is created would equally lead to an
inconsistency.

There are a couple of things happening here: a) In the first part we
take care of setting the subscription relation to SYNCDONE and dropping
the replication slot at the publisher node. Only if dropping the
replication slot is successful will the relation state be set to
SYNCDONE; if dropping the replication slot fails, the relation state
will still be FINISHEDCOPY. So if there is a failure in dropping the
replication slot we will not have an issue, as the tablesync worker
will be in FINISHEDCOPY state and this state is not allowed for
upgrade. When the state is SYNCDONE the tablesync slot will not be
present. b) In the second part we drop the replication origin. Even if
there is a chance that dropping the replication origin fails for some
reason, there will be no problem, as we do not copy the tablesync
replication origin to the new cluster while upgrading. Since the
tablesync replication origin is not copied to the new cluster there
will be no replication origin leaks.

And, this will work because in the SYNCDONE state, while removing the
origin, we are okay with missing origins. It seems not copying the
origin for tablesync workers in this state (SYNCDONE) relies on the
fact that currently, we don't use those origins once the system
reaches the SYNCDONE state but I am not sure it is a good idea to have
such a dependency, and an upgrade assuming such things doesn't seem
ideal to me. Personally, I think allowing an upgrade in 'i'
(initialize) state or 'r' (ready) state seems safe because in those
states either slots/origins don't exist or are dropped. What do you
think?
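(For reference, the per-relation sync states in question can be inspected with something like:

```sql
SELECT s.subname, sr.srrelid::regclass AS relation, sr.srsubstate
FROM pg_subscription_rel sr
JOIN pg_subscription s ON s.oid = sr.srsubid;
```
)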

--
With Regards,
Amit Kapila.

#135Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#131)
Re: pg_upgrade and logical replication

Here are some review comments for patch v15-0001

======
src/bin/pg_dump/pg_dump.c

1. getSubscriptions

+ if (fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, "o.remote_lsn AS suboriginremotelsn\n");
+ else
+ appendPQExpBufferStr(query, "NULL AS suboriginremotelsn\n");
+

There should be preceding spaces in those append strings to match the
other ones.

~~~

2. dumpSubscriptionTable

+/*
+ * dumpSubscriptionTable
+ *   Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+ DumpOptions *dopt = fout->dopt;
+ SubscriptionInfo *subinfo = subrinfo->subinfo;
+ PQExpBuffer query;
+ char    *tag;
+
+ /* Do nothing in data-only dump */
+ if (dopt->dataOnly)
+ return;
+
+ Assert(fout->dopt->binary_upgrade || fout->remoteVersion >= 170000);

The function comment says this is only for binary-upgrade mode, so why
does the Assert use || (OR)?

======
src/bin/pg_upgrade/check.c

3. check_and_dump_old_cluster

+ /*
+ * Subscription dependencies can be migrated since PG17. See comments atop
+ * get_old_cluster_subscription_count().
+ */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+ check_old_cluster_subscription_state(&old_cluster);
+

Should this be combined with the other adjacent check so there is only
one "if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)"
needed?

~~~

4. check_new_cluster

  check_new_cluster_logical_replication_slots();
+
+ check_new_cluster_subscription_configuration();

When checking the old cluster, the subscription was checked before the
slots, but here for the new cluster, the slots are checked before the
subscription. Maybe it makes no difference but it might be tidier to
do these old/new checks in the same order.

~~~

5. check_new_cluster_logical_replication_slots

- /* Quick return if there are no logical slots to be migrated. */
+ /* Quick return if there are no logical slots to be migrated */

Change is not relevant for this patch.

~~~

6.

+ res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+ "WHERE name IN ('max_replication_slots') "
+ "ORDER BY name DESC;");

Using IN and ORDER BY in this SQL seems unnecessary when you are only
searching for one name.
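i.e., something like:

```sql
SELECT setting FROM pg_settings WHERE name = 'max_replication_slots';
```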

======
src/bin/pg_upgrade/info.c

7. statics

-
+static void get_old_cluster_subscription_count(DbInfo *dbinfo);

This change also removes an existing blank line -- not sure if that
was intentional

~~~

8.
@@ -365,7 +369,6 @@ get_template0_info(ClusterInfo *cluster)
PQfinish(conn);
}

-
/*
* get_db_infos()
*

This blank line change (before get_db_infos) should not be part of this patch.

~~~

9. get_old_cluster_subscription_count

It seems a slightly misleading function name because this is a PER-DB
count, not a cluster count.

~~~

10.
+ /* Subscriptions can be migrated since PG17. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+ return;

IMO it is better to compare < 1700 instead of <= 1600. It keeps the
code more aligned with the comment.

~~~

11. count_old_cluster_subscriptions

+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscription for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 and prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_old_cluster_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+ int nsubs = 0;
+
+ for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+ nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+ return nsubs;
+}

11a.
/subscription/subscriptions/

~

11b.
The code is now consistent with the slots code which looks good. OTOH
I thought that 'pg_catalog.pg_subscription' is shared across all
databases of the cluster, so isn't this code inefficient to be
querying again and again for every database (if there are many of
them) instead of just querying 1 time only for the whole cluster?
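For example, a single cluster-wide query run once could be (a sketch):

```sql
SELECT d.datname, count(s.oid) AS nsubs
FROM pg_subscription s
JOIN pg_database d ON d.oid = s.subdbid
GROUP BY d.datname;
```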

======
src/bin/pg_upgrade/t/004_subscription.pl

12.
It is difficult to keep track of all the tables (upgraded and not
upgraded) at each step of these tests. Maybe the comments can be more
explicit along the way. e.g

BEFORE
+# Add tab_not_upgraded1 to the publication

SUGGESTION
+# Add tab_not_upgraded1 to the publication. Now publication has <blah blah>

and

BEFORE
+# Subscription relations should be preserved

SUGGESTION
+# Subscription relations should be preserved. The upgraded won't know
about 'tab_not_upgraded1' because <blah blah>

etc.

~~~

13.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+ "no change in table tab_not_upgraded1 afer enable subscription which
is not part of the publication"

/afer/after/

~~~

14.
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run a) if there's a subscription with tables
+# in a state different than 'r' (ready), 'i' (init) and 's' (synchronized)
+# and/or b) if the subscription does not have a replication origin.
+# ------------------------------------------------------

14a,
/does not have a/has no/

~

14b.
Maybe put a) and b) on newlines to be more readable.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#136Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#134)
Re: pg_upgrade and logical replication

On Mon, Nov 20, 2023 at 09:49:41AM +0530, Amit Kapila wrote:

On Tue, Nov 14, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:

There are a couple of things happening here: a) In the first part we
take care of setting the subscription relation to SYNCDONE and dropping
the replication slot at the publisher node. Only if dropping the
replication slot is successful will the relation state be set to
SYNCDONE; if dropping the replication slot fails, the relation state
will still be FINISHEDCOPY. So if there is a failure in dropping the
replication slot we will not have an issue, as the tablesync worker
will be in FINISHEDCOPY state and this state is not allowed for
upgrade. When the state is SYNCDONE the tablesync slot will not be
present. b) In the second part we drop the replication origin. Even if
there is a chance that dropping the replication origin fails for some
reason, there will be no problem, as we do not copy the tablesync
replication origin to the new cluster while upgrading. Since the
tablesync replication origin is not copied to the new cluster there
will be no replication origin leaks.

And, this will work because in the SYNCDONE state, while removing the
origin, we are okay with missing origins. It seems not copying the
origin for tablesync workers in this state (SYNCDONE) relies on the
fact that currently, we don't use those origins once the system
reaches the SYNCDONE state but I am not sure it is a good idea to have
such a dependency, and an upgrade assuming such things doesn't seem
ideal to me.

Hmm, yeah, you mean the replorigin_drop_by_name() calls in
tablesync.c. I did not pay much attention about that in the code, but
your point sounds sensible.

(I have not been able to complete an analysis of the risks behind 's'
to convince myself that it is entirely safe, but leaks are scary as
hell if this gets automated across a large fleet of nodes..)

Personally, I think allowing an upgrade in 'i'
(initialize) state or 'r' (ready) state seems safe because in those
states either slots/origins don't exist or are dropped. What do you
think?

I share a similar impression about 's'. From a design point of view,
making the conditions harder to reach in the first implementation
makes the user experience stricter, but that's safer regarding leaks
and it is still possible to relax these choices in the future
depending on the improvement pieces we are able to figure out.
--
Michael

#137vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#135)
1 attachment(s)
Re: pg_upgrade and logical replication

On Mon, 20 Nov 2023 at 10:44, Peter Smith <smithpb2250@gmail.com> wrote:

Here are some review comments for patch v15-0001

======
src/bin/pg_dump/pg_dump.c

1. getSubscriptions

+ if (fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, "o.remote_lsn AS suboriginremotelsn\n");
+ else
+ appendPQExpBufferStr(query, "NULL AS suboriginremotelsn\n");
+

There should be preceding spaces in those append strings to match the
other ones.

Modified

~~~

2. dumpSubscriptionTable

+/*
+ * dumpSubscriptionTable
+ *   Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+ DumpOptions *dopt = fout->dopt;
+ SubscriptionInfo *subinfo = subrinfo->subinfo;
+ PQExpBuffer query;
+ char    *tag;
+
+ /* Do nothing in data-only dump */
+ if (dopt->dataOnly)
+ return;
+
+ Assert(fout->dopt->binary_upgrade || fout->remoteVersion >= 170000);

The function comment says this is only for binary-upgrade mode, so why
does the Assert use || (OR)?

Added comments

======
src/bin/pg_upgrade/check.c

3. check_and_dump_old_cluster

+ /*
+ * Subscription dependencies can be migrated since PG17. See comments atop
+ * get_old_cluster_subscription_count().
+ */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+ check_old_cluster_subscription_state(&old_cluster);
+

Should this be combined with the other adjacent check so there is only
one "if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)"
needed?

Modified

~~~

4. check_new_cluster

check_new_cluster_logical_replication_slots();
+
+ check_new_cluster_subscription_configuration();

When checking the old cluster, the subscription was checked before the
slots, but here for the new cluster, the slots are checked before the
subscription. Maybe it makes no difference but it might be tidier to
do these old/new checks in the same order.

Modified

~~~

5. check_new_cluster_logical_replication_slots

- /* Quick return if there are no logical slots to be migrated. */
+ /* Quick return if there are no logical slots to be migrated */

Change is not relevant for this patch.

Removed it

~~~

6.

+ res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+ "WHERE name IN ('max_replication_slots') "
+ "ORDER BY name DESC;");

Using IN and ORDER BY in this SQL seems unnecessary when you are only
searching for one name.

Modified

======
src/bin/pg_upgrade/info.c

7. statics

-
+static void get_old_cluster_subscription_count(DbInfo *dbinfo);

This change also removes an existing blank line -- not sure if that
was intentional

Modified

~~~

8.
@@ -365,7 +369,6 @@ get_template0_info(ClusterInfo *cluster)
PQfinish(conn);
}

-
/*
* get_db_infos()
*

This blank line change (before get_db_infos) should not be part of this patch.

Modified

~~~

9. get_old_cluster_subscription_count

It seems a slightly misleading function name because this is a PER-DB
count, not a cluster count.

Modified

~~~

10.
+ /* Subscriptions can be migrated since PG17. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+ return;

IMO it is better to compare < 1700 instead of <= 1600. It keeps the
code more aligned with the comment.

Modified

~~~

11. count_old_cluster_subscriptions

+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscription for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 and prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_old_cluster_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+ int nsubs = 0;
+
+ for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+ nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+ return nsubs;
+}

11a.
/subscription/subscriptions/

Modified

~

11b.
The code is now consistent with the slots code which looks good. OTOH
I thought that 'pg_catalog.pg_subscription' is shared across all
databases of the cluster, so isn't this code inefficient to be
querying again and again for every database (if there are many of
them) instead of just querying 1 time only for the whole cluster?

My earlier version was like that; I changed it to keep the code
consistent with the logical replication slots code.

======
src/bin/pg_upgrade/t/004_subscription.pl

12.
It is difficult to keep track of all the tables (upgraded and not
upgraded) at each step of these tests. Maybe the comments can be more
explicit along the way. e.g

BEFORE
+# Add tab_not_upgraded1 to the publication

SUGGESTION
+# Add tab_not_upgraded1 to the publication. Now publication has <blah blah>

and

BEFORE
+# Subscription relations should be preserved

SUGGESTION
+# Subscription relations should be preserved. The upgraded won't know
about 'tab_not_upgraded1' because <blah blah>

etc.

Modified

~~~

13.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+ "no change in table tab_not_upgraded1 afer enable subscription which
is not part of the publication"

/afer/after/

Modified

~~~

14.
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run a) if there's a subscription with tables
+# in a state different than 'r' (ready), 'i' (init) and 's' (synchronized)
+# and/or b) if the subscription does not have a replication origin.
+# ------------------------------------------------------

14a,
/does not have a/has no/

Modified

~

14b.
Maybe put a) and b) on newlines to be more readable.

Modified

The attached v16 version patch has the changes for the same.

Regards,
Vignesh

Attachments:

v16-0001-Preserve-the-full-subscription-s-state-during-pg.patch (text/x-patch)
From b7b969420d59cf6cce656fa0638b87f13141aa27 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v16] Preserve the full subscription's state during pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_add_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_add_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, the state is the state of the relation, sublsn is subscription lsn
which is optional, and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values (subname, relid, state and sublsn) from the
old cluster.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To fix this problem, this patch teaches pg_dump to update the replication
origin along with create subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is subscription lsn.
pg_dump will retrieve these values (subname and sublsn) from the old cluster.

pg_upgrade will check that all the subscription relations are in 'i' (init),
's' (data sync) or 'r' (ready) state, and will error out if that's not the
case, logging the reason for the failure.

Author: Vignesh C, Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  72 +++++
 src/backend/utils/adt/pg_upgrade_support.c | 125 +++++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 200 +++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  16 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 182 ++++++++++++-
 src/bin/pg_upgrade/info.c                  |  57 +++-
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   2 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 295 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/tools/pgindent/typedefs.list           |   1 +
 13 files changed, 984 insertions(+), 10 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 4f78e0e1c0..5b8863c7fd 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,78 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies which includes the subscription table information present in
+     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and also the subscription replication origin. This allows
+     logical replication on the new subscriber to continue from where the
+     old subscriber was up to. Migration of subscription dependencies is only
+     supported when the old cluster is version 17.0 or later. Subscription
+     dependencies on clusters before version 17.0 will silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize), <literal>r</literal> (ready) or
+       <literal>s</literal> (synchronized). This can be verified by checking
+       <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, do this:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Create all the new tables that were created in the publication during
+       upgrade and refresh the publication by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..53cfa72b6f 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,22 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +312,121 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	remote_commit;
+	char		originname[NAMEDATALEN];
+	RepOriginId node;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	remote_commit = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* Lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	node = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster, and the shutdown checkpoint will flush the advanced origin
+	 * data to disk.
+	 */
+	replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 34fd0a86e9..eedc1b622f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4583,6 +4584,95 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBufferStr(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+						 " FROM pg_catalog.pg_subscription_rel"
+						 " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4609,6 +4699,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4664,16 +4755,28 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn\n");
+	else
+		appendPQExpBufferStr(query, " NULL AS suboriginremotelsn\n");
+
+	appendPQExpBufferStr(query,
+						 "FROM pg_subscription s\n");
+
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query,
+							 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+							 "    ON o.external_id = 'pg_' || s.oid::text \n");
+
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4700,6 +4803,7 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4737,6 +4841,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4746,6 +4855,76 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode and for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to pg_subscription_rel table. This will be used only in
+		 * binary-upgrade mode.
+		 */
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+		appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+		appendPQExpBuffer(query,
+						  ", %u, '%c'",
+						  subrinfo->tblinfo->dobj.catId.oid,
+						  subrinfo->srsubstate);
+
+		if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+			appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn);
+		else
+			appendPQExpBufferStr(query, ", NULL");
+
+		appendPQExpBufferStr(query, ");\n");
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4826,6 +5005,17 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000 &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10444,6 +10634,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18510,6 +18703,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..62b3d9249b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -671,8 +672,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +711,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +771,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..df335fde43 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -34,7 +34,9 @@ static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 static void check_new_cluster_logical_replication_slots(void);
+static void check_new_cluster_subscription_configuration(void);
 static void check_old_cluster_for_valid_slots(bool live_check);
+static void check_old_cluster_subscription_state(ClusterInfo *cluster);
 
 
 /*
@@ -112,13 +114,21 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/*
-	 * Logical replication slots can be migrated since PG17. See comments atop
-	 * get_old_cluster_logical_slot_infos().
-	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+	{
+		/*
+		 * Logical replication slots can be migrated since PG17. See comments
+		 * atop get_old_cluster_logical_slot_infos().
+		 */
 		check_old_cluster_for_valid_slots(live_check);
 
+		/*
+		 * Subscription dependencies can be migrated since PG17. See comments
+		 * atop get_db_subscription_count().
+		 */
+		check_old_cluster_subscription_state(&old_cluster);
+	}
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -237,6 +247,8 @@ check_new_cluster(void)
 	check_for_new_tablespace_dir();
 
 	check_new_cluster_logical_replication_slots();
+
+	check_new_cluster_subscription_configuration();
 }
 
 
@@ -1538,6 +1550,52 @@ check_new_cluster_logical_replication_slots(void)
 	check_ok();
 }
 
+/*
+ * check_new_cluster_subscription_configuration()
+ *
+ * Verify that the max_replication_slots configuration specified is enough for
+ * creating the subscriptions.
+ */
+static void
+check_new_cluster_subscription_configuration(void)
+{
+	PGresult   *res;
+	PGconn	   *conn;
+	int			nsubs_on_old;
+	int			max_replication_slots;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+		return;
+
+	nsubs_on_old = count_old_cluster_subscriptions();
+
+	/* Quick return if there are no subscriptions to be migrated */
+	if (nsubs_on_old == 0)
+		return;
+
+	prep_status("Checking for new cluster configuration for subscriptions");
+
+	conn = connectToServer(&new_cluster, "template1");
+
+	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+							"WHERE name = 'max_replication_slots';");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine parameter settings on new cluster");
+
+	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
+	if (nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
+	PQclear(res);
+	PQfinish(conn);
+
+	check_ok();
+}
+
 /*
  * check_old_cluster_for_valid_slots()
  *
@@ -1613,3 +1671,119 @@ check_old_cluster_for_valid_slots(bool live_check)
 
 	check_ok();
 }
+
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that each subscription has all of its corresponding tables in the
+ * i (initialize), r (ready) or s (synchronized) state.
+ */
+static void
+check_old_cluster_subscription_state(ClusterInfo *cluster)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subs_invalid.txt");
+	for (int dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "replication origin is missing for database:\"%s\" subscription:\"%s\"\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * Tables in the 'i' (initialize) state do not have a tablesync slot
+		 * yet, while tables in the 'r' (ready) and 's' (synchronized) states
+		 * had one that has already been dropped, so these three states are
+		 * safe to upgrade. The other states listed below are not ok:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * tracked by the publisher does not match anymore.
+		 *
+		 * b) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * c) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, n.nspname, c.relname, r.srsubstate "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r', 's') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\" state:\"%s\" not in required state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in the i (initialize), r (ready) or s (synchronized) state.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..acdca33b07 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,6 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
+static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,10 +294,14 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		get_rel_infos(cluster, pDbInfo);
 
 		/*
-		 * Retrieve the logical replication slots infos for the old cluster.
+		 * Retrieve the logical replication slot information and the
+		 * subscription count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
+		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
+			get_db_subscription_count(pDbInfo);
+		}
 	}
 
 	if (cluster == &old_cluster)
@@ -730,6 +735,56 @@ count_old_cluster_logical_slots(void)
 	return slot_count;
 }
 
+/*
+ * get_db_subscription_count()
+ *
+ * Gets the number of subscriptions in the database.
+ *
+ * Note: This function will not do anything if the old cluster is pre-PG17.
+ * This is because logical slots are not migrated from older versions, so a
+ * logical replication setup could not be upgraded completely anyway.
+ */
+static void
+get_db_subscription_count(DbInfo *dbinfo)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	res = executeQueryOrDie(conn, "SELECT count(*) "
+							"FROM pg_catalog.pg_subscription WHERE subdbid = %u",
+							dbinfo->db_oid);
+
+	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscriptions for all databases.
+ *
+ * Note: this function always returns 0 if the old cluster is PG16 or prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_db_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+	int			nsubs = 0;
+
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+	return nsubs;
+}
+
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..d63f13fffc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -195,6 +195,7 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
+	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -421,6 +422,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
+int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..1a5c14a0f8
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,295 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+my $oldbindir = $old_sub->config_data('--bindir');
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $newbindir = $new_sub->config_data('--bindir');
+
+sub insert_line_at_pub
+{
+	my $payload = shift;
+
+	foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line_at_pub('before initial sync');
+
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub FOR TABLE tab_upgraded1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION regress_pub"
+);
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in the init or
+# ready state.
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded1 VALUES (generate_series(2,50), 'before initial sync')"
+);
+$publisher->wait_for_catchup('regress_sub');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub->restart;
+
+# Add tab_upgraded2 to the publication. Now the publication has the
+# tab_upgraded1 and tab_upgraded2 tables.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+
+# The tab_upgraded2 table will be in init state as the subscriber
+# configuration max_logical_replication_workers is set to 0.
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach init state";
+
+# Get the replication origin remote_lsn of the old subscriber
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+# Insert a row in tab_upgraded1 and tab_not_upgraded1 publisher table while
+# it's down.
+insert_line_at_pub('while old_sub is down');
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+		'-D',         $new_sub->data_dir, '-b', $oldbindir,
+		'-B',         $newbindir,         '-s', $new_sub->host,
+		'-p',         $old_sub->port,     '-P', $new_sub->port,
+		$mode
+	],
+	'run of pg_upgrade for old instance when the subscription tables are in init or ready state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Add tab_not_upgraded1 to the publication. Now publication has tab_upgraded1,
+# tab_upgraded2 and tab_not_upgraded1 tables.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_not_upgraded1");
+
+$new_sub->start;
+
+# Subscription relations should be preserved.  The upgraded subscriber won't
+# know about 'tab_not_upgraded1' because the subscription is not yet refreshed.
+my $result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2),
+	"There should be 2 rows in pg_subscription_rel (representing tab_upgraded1 and tab_upgraded2)"
+);
+
+# The replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# Check the number of rows for each table on each server
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check initial tab_upgraded1 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2), "check initial tab_upgraded2 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2), "check initial tab_not_upgraded1 table data on publisher");
+
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(50),
+	"check initial tab_upgraded1 table data on the new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(0),
+	"check initial tab_upgraded2 table data on upgraded subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"check initial tab_not_upgraded1 table data on the new subscriber");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub ENABLE");
+
+$publisher->wait_for_catchup('regress_sub');
+
+# Rows on tab_upgraded1 and tab_upgraded2 should have been replicated, while
+# nothing should happen for tab_not_upgraded1.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2),
+	"check the data is synced after enabling the subscription for the table that was in init state"
+);
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"no change in table tab_not_upgraded1 after enable subscription which is not part of the publication"
+);
+
+# Refresh the subscription, the missing row on tab_not_upgraded1 should be
+# replicated.
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+# cleanup
+$new_sub->stop;
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 4");
+$old_sub->start;
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than
+#    'r' (ready), 'i' (init) or 's' (synchronized), or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+
+$publisher->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$old_sub->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
+$old_sub->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub1 FOR TABLE tab_primary_key");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+
+my $subid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+my $reporigin = 'pg_' . qq($subid);
+
+# Drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d',         $old_sub->data_dir,
+		'-D',         $new_sub1->data_dir,
+		'-b',         $oldbindir,
+		'-B',         $newbindir,
+		'-s',         $new_sub1->host,
+		'-p',         $old_sub->port,
+		'-P',         $new_sub1->port,
+		$mode,        '--check',
+	],
	'run of pg_upgrade --check for old instance with relation in \'d\' datasync (invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# timestamp with millisecond precision. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub1->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/database:\"postgres\" subscription:\"regress_sub1\" schema:\"public\" relation:\"tab_primary_key\" state:\"d\" not in required state/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub2 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/replication origin is missing for database:\"postgres\" subscription:\"regress_sub2\"/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fb58dee3bc..45c681db5e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11396,6 +11396,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index dba3498a13..eaa5c5a7cb 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2661,6 +2661,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#138Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#137)
Re: pg_upgrade and logical replication

Thanks for addressing my past review comments.

Here are some more review comments for patch v16-0001

======
doc/src/sgml/ref/pgupgrade.sgml

1.
+      <para>
+       Create all the new tables that were created in the publication during
+       upgrade and refresh the publication by executing
+       <link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+      </para>

"Create all ... that were created" sounds a bit strange.

SUGGESTION (maybe like this or similar?)
Create equivalent subscriber tables for anything that became newly
part of the publication during the upgrade and....

======
src/bin/pg_dump/pg_dump.c

2. getSubscriptionTables

+/*
+ * getSubscriptionTables
+ *   Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+ DumpOptions *dopt = fout->dopt;
+ SubscriptionInfo *subinfo = NULL;
+ SubRelInfo *subrinfo;
+ PQExpBuffer query;
+ PGresult   *res;
+ int i_srsubid;
+ int i_srrelid;
+ int i_srsubstate;
+ int i_srsublsn;
+ int ntups;
+ Oid last_srsubid = InvalidOid;
+
+ if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+ fout->remoteVersion < 170000)
+ return;

This function comment says "used only in binary-upgrade mode." and the
Assert says the same. But, is this compatible with the other function
dumpSubscriptionTable() where it says "used only in binary-upgrade
mode and for PG17 or later versions"?

======
src/bin/pg_upgrade/check.c

3. check_new_cluster_subscription_configuration

+static void
+check_new_cluster_subscription_configuration(void)
+{
+ PGresult   *res;
+ PGconn    *conn;
+ int nsubs_on_old;
+ int max_replication_slots;
+
+ /* Logical slots can be migrated since PG17. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+ return;

IMO it is better to say < 1700 in this check, instead of <= 1600.

~~~

4.
+ /* Quick return if there are no subscriptions to be migrated */
+ if (nsubs_on_old == 0)
+ return;

Missing period in comment.

~~~

5.
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that each of the subscriptions has all their corresponding tables in
+ * i (initialize), r (ready) or s (synchronized) state.
+ */
+static void
+check_old_cluster_subscription_state(ClusterInfo *cluster)

This function is only for the old cluster (hint: the function name) so
there is no need to pass the 'cluster' parameter here. Just directly
use old_cluster in the function body.

======
src/bin/pg_upgrade/t/004_subscription.pl

6.
+# Add tab_not_upgraded1 to the publication. Now publication has tab_upgraded1
+# and tab_upgraded2 tables.
+$publisher->safe_psql('postgres',
+ "ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");

Typo in comment. You added tab_not_upgraded2, not tab_not_upgraded1

~~

7.
+# Subscription relations should be preserved. The upgraded won't know
+# about 'tab_not_upgraded1' because the subscription is not yet refreshed.

Typo or missing word in comment?

"The upgraded" ??

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#139Shlok Kyal
shlok.kyal.oss@gmail.com
In reply to: Peter Smith (#138)
1 attachment(s)
Re: pg_upgrade and logical replication

On Wed, 22 Nov 2023 at 06:48, Peter Smith <smithpb2250@gmail.com> wrote:

======
doc/src/sgml/ref/pgupgrade.sgml

1.
+      <para>
+       Create all the new tables that were created in the publication during
+       upgrade and refresh the publication by executing
+       <link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+      </para>

"Create all ... that were created" sounds a bit strange.

SUGGESTION (maybe like this or similar?)
Create equivalent subscriber tables for anything that became newly
part of the publication during the upgrade and....

Modified

======
src/bin/pg_dump/pg_dump.c

2. getSubscriptionTables

+/*
+ * getSubscriptionTables
+ *   Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+ DumpOptions *dopt = fout->dopt;
+ SubscriptionInfo *subinfo = NULL;
+ SubRelInfo *subrinfo;
+ PQExpBuffer query;
+ PGresult   *res;
+ int i_srsubid;
+ int i_srrelid;
+ int i_srsubstate;
+ int i_srsublsn;
+ int ntups;
+ Oid last_srsubid = InvalidOid;
+
+ if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+ fout->remoteVersion < 170000)
+ return;

This function comment says "used only in binary-upgrade mode." and the
Assert says the same. But, is this compatible with the other function
dumpSubscriptionTable() where it says "used only in binary-upgrade
mode and for PG17 or later versions"?

Modified

======
src/bin/pg_upgrade/check.c

3. check_new_cluster_subscription_configuration

+static void
+check_new_cluster_subscription_configuration(void)
+{
+ PGresult   *res;
+ PGconn    *conn;
+ int nsubs_on_old;
+ int max_replication_slots;
+
+ /* Logical slots can be migrated since PG17. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+ return;

IMO it is better to say < 1700 in this check, instead of <= 1600.

Modified

~~~

4.
+ /* Quick return if there are no subscriptions to be migrated */
+ if (nsubs_on_old == 0)
+ return;

Missing period in comment.

Modified

~~~

5.
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that each of the subscriptions has all their corresponding tables in
+ * i (initialize), r (ready) or s (synchronized) state.
+ */
+static void
+check_old_cluster_subscription_state(ClusterInfo *cluster)

This function is only for the old cluster (hint: the function name) so
there is no need to pass the 'cluster' parameter here. Just directly
use old_cluster in the function body.

Modified

======
src/bin/pg_upgrade/t/004_subscription.pl

6.
+# Add tab_not_upgraded1 to the publication. Now publication has tab_upgraded1
+# and tab_upgraded2 tables.
+$publisher->safe_psql('postgres',
+ "ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");

Typo in comment. You added tab_not_upgraded2, not tab_not_upgraded1

Modified

~~

7.
+# Subscription relations should be preserved. The upgraded won't know
+# about 'tab_not_upgraded1' because the subscription is not yet refreshed.

Typo or missing word in comment?

"The upgraded" ??

Modified

Attached is the v17 patch, which has the same changes.

Thanks,
Shlok Kumar Kyal

Attachments:

v17-0001-Preserve-the-full-subscription-s-state-during-pg.patch (application/octet-stream)
From 7baa03b6057caede126b54ec97dc2b75b25dc256 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v17] Preserve the full subscription's state during pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state, it's impossible to re-enable the
subscriptions without missing some records, as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_add_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_add_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, state is the state of the relation, and sublsn is the subscription
LSN, which is optional and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values (subname, relid, state and sublsn) from the
old cluster.
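
For example (the OID and LSN values here are hypothetical), the restore script
produced by pg_dump contains one such call per subscription relation:

SELECT pg_catalog.binary_upgrade_add_sub_rel_state('regress_sub', 16384, 'r', '0/16D7010');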

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To preserve it, this patch also teaches pg_dump to restore the replication
origin along with CREATE SUBSCRIPTION, using the
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin's remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is the subscription
LSN.  pg_dump will retrieve these values (subname and sublsn) from the old cluster.
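
For example (with a hypothetical LSN), the dump output would contain something
like:

SELECT pg_catalog.binary_upgrade_replorigin_advance('regress_sub', '0/16D7010');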

pg_upgrade will check that all the subscription relations are in 'i' (init),
's' (synchronized) or 'r' (ready) state, and will error out if that's not the
case, logging the reason for the failure.

Author: Vignesh C, Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  72 +++++
 src/backend/utils/adt/pg_upgrade_support.c | 125 +++++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 200 +++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  16 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 183 ++++++++++++-
 src/bin/pg_upgrade/info.c                  |  57 +++-
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   2 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 295 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/tools/pgindent/typedefs.list           |   1 +
 13 files changed, 985 insertions(+), 10 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 4f78e0e1c0..def4e26221 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,78 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies which includes the subscription table information present in
+     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and also the subscription replication origin. This allows
+     logical replication on the new subscriber to continue from where the
+     old subscriber was up to. Migration of subscription dependencies is only
+     supported when the old cluster is version 17.0 or later. Subscription
+     dependencies on clusters before version 17.0 will silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize), <literal>r</literal> (ready) or
+       <literal>s</literal> (synchronized). This can be verified by checking
+       <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, do this:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Create equivalent subscriber tables for any tables that were newly added
+       to the publication during the upgrade, and refresh the publication by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..53cfa72b6f 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,22 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +312,121 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	remote_commit;
+	char		originname[NAMEDATALEN];
+	RepOriginId node;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	remote_commit = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* Lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	node = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster. Shutdown server will flush the origins during shutdown
+	 * checkpoint.
+	 */
+	replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 34fd0a86e9..ac3106e55e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4583,6 +4584,95 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode and for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4609,6 +4699,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4664,16 +4755,28 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn\n");
+	else
+		appendPQExpBufferStr(query, " NULL AS suboriginremotelsn\n");
+
+	appendPQExpBufferStr(query,
+						 "FROM pg_subscription s\n");
+
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query,
+							 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+							 "    ON o.external_id = 'pg_' || s.oid::text \n");
+
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4700,6 +4803,7 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4737,6 +4841,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4746,6 +4855,76 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode and for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to pg_subscription_rel table. This will be used only in
+		 * binary-upgrade mode.
+		 */
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+		appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+		appendPQExpBuffer(query,
+						  ", %u, '%c'",
+						  subrinfo->tblinfo->dobj.catId.oid,
+						  subrinfo->srsubstate);
+
+		if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+			appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn);
+		else
+			appendPQExpBufferStr(query, ", NULL");
+
+		appendPQExpBufferStr(query, ");\n");
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4826,6 +5005,17 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000 &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10444,6 +10634,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18510,6 +18703,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..62b3d9249b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -671,8 +672,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +711,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +771,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..dc1d9fcb9d 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -34,7 +34,9 @@ static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 static void check_new_cluster_logical_replication_slots(void);
+static void check_new_cluster_subscription_configuration(void);
 static void check_old_cluster_for_valid_slots(bool live_check);
+static void check_old_cluster_subscription_state(void);
 
 
 /*
@@ -112,13 +114,21 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/*
-	 * Logical replication slots can be migrated since PG17. See comments atop
-	 * get_old_cluster_logical_slot_infos().
-	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+	{
+		/*
+		 * Logical replication slots can be migrated since PG17. See comments
+		 * atop get_old_cluster_logical_slot_infos().
+		 */
 		check_old_cluster_for_valid_slots(live_check);
 
+		/*
+		 * Subscription dependencies can be migrated since PG17. See comments
+		 * atop get_db_subscription_count().
+		 */
+		check_old_cluster_subscription_state();
+	}
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -237,6 +247,8 @@ check_new_cluster(void)
 	check_for_new_tablespace_dir();
 
 	check_new_cluster_logical_replication_slots();
+
+	check_new_cluster_subscription_configuration();
 }
 
 
@@ -1538,6 +1550,52 @@ check_new_cluster_logical_replication_slots(void)
 	check_ok();
 }
 
+/*
+ * check_new_cluster_subscription_configuration()
+ *
+ * Verify that the max_replication_slots configuration specified is enough for
+ * creating the subscriptions.
+ */
+static void
+check_new_cluster_subscription_configuration(void)
+{
+	PGresult   *res;
+	PGconn	   *conn;
+	int			nsubs_on_old;
+	int			max_replication_slots;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	nsubs_on_old = count_old_cluster_subscriptions();
+
+	/* Quick return if there are no subscriptions to be migrated. */
+	if (nsubs_on_old == 0)
+		return;
+
+	prep_status("Checking for new cluster configuration for subscriptions");
+
+	conn = connectToServer(&new_cluster, "template1");
+
+	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+							"WHERE name = 'max_replication_slots';");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine parameter settings on new cluster");
+
+	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
+	if (nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
+	PQclear(res);
+	PQfinish(conn);
+
+	check_ok();
+}
+
 /*
  * check_old_cluster_for_valid_slots()
  *
@@ -1613,3 +1671,120 @@ check_old_cluster_for_valid_slots(bool live_check)
 
 	check_ok();
 }
+
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that each subscription has all of its corresponding tables in
+ * i (initialize), r (ready) or s (synchronized) state.
+ */
+static void
+check_old_cluster_subscription_state(void)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+	ClusterInfo *cluster = &old_cluster;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subs_invalid.txt");
+	for (int dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "replication origin is missing for database:\"%s\" subscription:\"%s\"\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * A slot not created yet refers to the 'i' (initialize) state, while
+		 * 'r' (ready) and 's' (synchronized) states refer to a slot created
+		 * previously but already dropped. These states are supported
+		 * for upgrade. The other states listed below are not ok:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * tracked by the publisher does not match anymore.
+		 *
+		 * b) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * c) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, n.nspname, c.relname, r.srsubstate "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r', 's') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\" state:\"%s\" not in required state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in the i (initialize), r (ready) or s (synchronized) state.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..acdca33b07 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,6 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
+static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,10 +294,14 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		get_rel_infos(cluster, pDbInfo);
 
 		/*
-		 * Retrieve the logical replication slots infos for the old cluster.
+		 * Retrieve the logical replication slots infos and the subscriptions
+		 * count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
+		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
+			get_db_subscription_count(pDbInfo);
+		}
 	}
 
 	if (cluster == &old_cluster)
@@ -730,6 +735,56 @@ count_old_cluster_logical_slots(void)
 	return slot_count;
 }
 
+/*
+ * get_db_subscription_count()
+ *
+ * Gets the number of subscriptions in the database.
+ *
+ * Note: This function does nothing if the old cluster is pre-PG17, because
+ * logical slots are not migrated from those versions, so a logical
+ * replication cluster cannot be upgraded completely anyway.
+ */
+static void
+get_db_subscription_count(DbInfo *dbinfo)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	res = executeQueryOrDie(conn, "SELECT count(*) "
+							"FROM pg_catalog.pg_subscription WHERE subdbid = %d",
+							dbinfo->db_oid);
+
+	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscriptions for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 and prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_db_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+	int			nsubs = 0;
+
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+	return nsubs;
+}
+
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..d63f13fffc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -195,6 +195,7 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
+	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -421,6 +422,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
+int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..bd07917148
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,295 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+my $oldbindir = $old_sub->config_data('--bindir');
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $newbindir = $new_sub->config_data('--bindir');
+
+sub insert_line_at_pub
+{
+	my $payload = shift;
+
+	foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line_at_pub('before initial sync');
+
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub FOR TABLE tab_upgraded1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION regress_pub"
+);
+
+# Wait for the catchup, as we need the subscription rel in ready state
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready or in
+# init state.
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded1 VALUES (generate_series(2,50), 'before initial sync')"
+);
+$publisher->wait_for_catchup('regress_sub');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub->restart;
+
+# Add tab_upgraded2 to the publication. Now publication has tab_upgraded1
+# and tab_upgraded2 tables.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+
+# The tables will be in init state as the subscriber configuration for
+# max_logical_replication_workers is set to 0.
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach init state";
+
+# Get the replication origin remote_lsn of the old subscriber
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+# Insert a row in tab_upgraded1 and tab_not_upgraded1 publisher table while
+# it's down.
+insert_line_at_pub('while old_sub is down');
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode
+	],
+	'run of pg_upgrade for old instance when the subscription tables are in ready or init state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Add tab_not_upgraded1 to the publication. Now publication has tab_upgraded1,
+# tab_upgraded2 and tab_not_upgraded1 tables.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_not_upgraded1");
+
+$new_sub->start;
+
+# Subscription relations should be preserved. The upgraded subscriber won't know
+# about 'tab_not_upgraded1' because the subscription is not yet refreshed.
+my $result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2),
+	"There should be 2 rows in pg_subscription_rel (representing tab_upgraded1 and tab_upgraded2)"
+);
+
+# The replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s where os.external_id = 'pg_' || s.oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# Check the number of rows for each table on each server
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check initial tab_upgraded1 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2), "check initial tab_upgraded2 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2), "check initial tab_not_upgraded1 table data on publisher");
+
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(50),
+	"check initial tab_upgraded1 table data on the new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(0),
+	"check initial tab_upgraded2 table data on upgraded subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"check initial tab_not_upgraded1 table data on the new subscriber");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub ENABLE");
+
+$publisher->wait_for_catchup('regress_sub');
+
+# Rows on tab_upgraded1 and tab_upgraded2 should have been replicated, while
+# nothing should happen for tab_not_upgraded1.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2),
+	"check the data is synced after enabling the subscription for the table that was in init state"
+);
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"no change in tab_not_upgraded1 as the subscription was not refreshed after the table was added to the publication"
+);
+
+# Refresh the subscription, the missing row on tab_not_upgraded1 should be
+# replicated.
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+# cleanup
+$new_sub->stop;
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 4");
+$old_sub->start;
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than
+#    'r' (ready), 'i' (init) or 's' (synchronized), and/or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+
+$publisher->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$old_sub->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
+$old_sub->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub1 FOR TABLE tab_primary_key");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+
+my $subid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+my $reporigin = 'pg_' . qq($subid);
+
+# Drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync (invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub1->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/database:\"postgres\" subscription:\"regress_sub1\" schema:\"public\" relation:\"tab_primary_key\" state:\"d\" not in required state/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub2 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/replication origin is missing for database:\"postgres\" subscription:\"regress_sub2\"/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fb58dee3bc..45c681db5e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11396,6 +11396,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index dba3498a13..eaa5c5a7cb 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2661,6 +2661,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#140Peter Smith
smithpb2250@gmail.com
In reply to: Shlok Kyal (#139)
Re: pg_upgrade and logical replication

Here are some review comments for patch v17-0001

======
src/bin/pg_dump/pg_dump.c

1. getSubscriptionTables

+/*
+ * getSubscriptionTables
+ *   Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode and for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+ DumpOptions *dopt = fout->dopt;
+ SubscriptionInfo *subinfo = NULL;
+ SubRelInfo *subrinfo;
+ PQExpBuffer query;
+ PGresult   *res;
+ int i_srsubid;
+ int i_srrelid;
+ int i_srsubstate;
+ int i_srsublsn;
+ int ntups;
+ Oid last_srsubid = InvalidOid;
+
+ if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+ fout->remoteVersion < 170000)
+ return;

I still felt that the function comment ("used only in binary-upgrade
mode and for PG17 or later") was misleading. IMO that sounds like it
would be OK for PG17 regardless of the binary mode, but the code says
otherwise.

Assuming the code is correct, perhaps the comment should say:
"... used only in binary-upgrade mode for PG17 or later versions."

~~~

2. dumpSubscriptionTable

+/*
+ * dumpSubscriptionTable
+ *   Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode and for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)

(this is the same as the previous review comment #1)

Assuming the code is correct, perhaps the comment should say:
"... used only in binary-upgrade mode for PG17 or later versions."

======
src/bin/pg_upgrade/check.c

3.
+static void
+check_old_cluster_subscription_state()
+{
+ FILE    *script = NULL;
+ char output_path[MAXPGPATH];
+ int ntup;
+ ClusterInfo *cluster = &old_cluster;
+
+ prep_status("Checking for subscription state");
+
+ snprintf(output_path, sizeof(output_path), "%s/%s",
+ log_opts.basedir,
+ "subs_invalid.txt");
+ for (int dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+ {
+ PGresult   *res;
+ DbInfo    *active_db = &cluster->dbarr.dbs[dbnum];
+ PGconn    *conn = connectToServer(cluster, active_db->db_name);

There seems no need for an extra variable ('cluster') here when you
can just reference 'old_cluster' directly in the code, the same as
other functions in this file do all the time.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#141vignesh C
vignesh21@gmail.com
In reply to: Michael Paquier (#136)
1 attachment(s)
Re: pg_upgrade and logical replication

On Tue, 21 Nov 2023 at 07:11, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Nov 20, 2023 at 09:49:41AM +0530, Amit Kapila wrote:

On Tue, Nov 14, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:

There are a couple of things happening here: a) In the first part we
take care of setting the subscription relation to SYNCDONE and dropping
the replication slot at the publisher node. The relation state is set to
SYNCDONE only if dropping the replication slot succeeds; if the drop
fails, the relation state will still be FINISHEDCOPY. So a failure to
drop the replication slot is not an issue, as the tablesync worker will
remain in FINISHEDCOPY state and that state is not allowed for upgrade.
When the state is SYNCDONE the tablesync slot will not be present. b) In
the second part we drop the replication origin. Even if dropping the
replication origin fails for some reason, there will be no problem, as
we do not copy the table sync replication origin to the new cluster
while upgrading. Since the table sync replication origin is not copied
to the new cluster there will be no replication origin leaks.
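
To illustrate, the per-table sync state and the corresponding tablesync
origins being discussed can be inspected on the old cluster with a query
along these lines (an illustrative sketch; the "pg_<subid>_<relid>" origin
name is the current internal naming convention for tablesync origins):

```sql
-- Rough sketch: show each subscription relation's sync state and whether
-- its tablesync replication origin ("pg_<subid>_<relid>") still exists.
SELECT s.subname,
       r.srrelid::regclass AS relation,
       r.srsubstate,
       (o.roname IS NOT NULL) AS tablesync_origin_exists
FROM pg_catalog.pg_subscription_rel r
JOIN pg_catalog.pg_subscription s ON s.oid = r.srsubid
LEFT JOIN pg_catalog.pg_replication_origin o
       ON o.roname = 'pg_' || r.srsubid || '_' || r.srrelid;
```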

And, this will work because in the SYNCDONE state, while removing the
origin, we are okay with missing origins. Not copying the origin for
tablesync workers in this state (SYNCDONE) relies on the fact that
currently we don't use those origins once the system reaches the
SYNCDONE state, but I am not sure it is a good idea to have such a
dependency, and an upgrade assuming such things doesn't seem ideal to
me.

Hmm, yeah, you mean the replorigin_drop_by_name() calls in
tablesync.c. I did not pay much attention to that in the code, but
your point sounds sensible.

(I have not been able to complete an analysis of the risks behind 's'
to convince myself that it is entirely safe, but leaks are scary as
hell if this gets automated across a large fleet of nodes..)

Personally, I think allowing an upgrade in 'i'
(initialize) state or 'r' (ready) state seems safe because in those
states either slots/origins don't exist or are dropped. What do you
think?
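
Concretely, that check would amount to verifying that a query like the
following returns no rows on each database of the old cluster (a sketch,
not necessarily the exact query the patch will end up using):

```sql
-- Relations in any state other than 'i' (init) or 'r' (ready) would
-- block the upgrade under this proposal.
SELECT s.subname, r.srrelid::regclass AS relation, r.srsubstate
FROM pg_catalog.pg_subscription_rel r
JOIN pg_catalog.pg_subscription s ON s.oid = r.srsubid
WHERE r.srsubstate NOT IN ('i', 'r');
```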

I share a similar impression about 's'. From a design point of view,
making the conditions harder to reach in the first implementation
makes the user experience stricter, but that's safer regarding leaks,
and it is still possible to relax these choices in the future
depending on the improvement pieces we are able to figure out.

Based on the suggestion to allow only the safe init and ready states,
I have made these changes in the attached v18 version of the patch.

Regards,
Vignesh

Attachments:

v18-0001-Preserve-the-full-subscription-s-state-during-pg.patch (text/x-patch; charset=US-ASCII)
From 9b04ee88e58204aa8dbfd0a821225a5b0474512c Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v18] Preserve the full subscription's state during pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_add_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_add_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, state is the state of the relation, and sublsn is the
subscription LSN, which is optional and defaults to NULL/InvalidXLogRecPtr
if not provided. pg_dump will retrieve these values (subname, relid, state
and sublsn) from the old cluster.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To fix this problem, this patch teaches pg_dump to update the replication
origin along with create subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is the
subscription LSN. pg_dump will retrieve these values (subname and sublsn)
from the old cluster.

pg_upgrade will check that all the subscription relations are in 'i' (init) or
in 'r' (ready) state, and will error out if that's not the case, logging the
reason for the failure.

Author: Vignesh C, Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  71 +++++
 src/backend/utils/adt/pg_upgrade_support.c | 125 ++++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 200 +++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  16 +
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 185 ++++++++++-
 src/bin/pg_upgrade/info.c                  |  56 +++-
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   2 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 337 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/tools/pgindent/typedefs.list           |   1 +
 13 files changed, 1027 insertions(+), 10 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 4f78e0e1c0..1e0104d5a1 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,77 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies which includes the subscription table information present in
+     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and also the subscription replication origin. This allows
+     logical replication on the new subscriber to continue from where the
+     old subscriber was up to. Migration of subscription dependencies is only
+     supported when the old cluster is version 17.0 or later. Subscription
+     dependencies on clusters before version 17.0 will silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize) or <literal>r</literal> (ready). This
+       can be verified by checking <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, do this:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Create equivalent subscriber tables corresponding to tables newly added as
+       part of the publication during the upgrade and refresh the publication by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..53cfa72b6f 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,22 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +312,121 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	remote_commit;
+	char		originname[NAMEDATALEN];
+	RepOriginId node;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	remote_commit = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* Lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	node = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster. Shutdown server will flush the origins during shutdown
+	 * checkpoint.
+	 */
+	replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 34fd0a86e9..39ebd9b3aa 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4583,6 +4584,95 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4609,6 +4699,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4664,16 +4755,28 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn\n");
+	else
+		appendPQExpBufferStr(query, " NULL AS suboriginremotelsn\n");
+
+	appendPQExpBufferStr(query,
+						 "FROM pg_subscription s\n");
+
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query,
+							 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+							 "    ON o.external_id = 'pg_' || s.oid::text \n");
+
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4700,6 +4803,7 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4737,6 +4841,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4746,6 +4855,76 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to pg_subscription_rel table. This will be used only in
+		 * binary-upgrade mode.
+		 */
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+		appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+		appendPQExpBuffer(query,
+						  ", %u, '%c'",
+						  subrinfo->tblinfo->dobj.catId.oid,
+						  subrinfo->srsubstate);
+
+		if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+			appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn);
+		else
+			appendPQExpBuffer(query, ", NULL");
+
+		appendPQExpBufferStr(query, ");\n");
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4826,6 +5005,17 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000 &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10444,6 +10634,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18510,6 +18703,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..62b3d9249b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -671,8 +672,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +711,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +771,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..bf759cdf18 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -34,7 +34,9 @@ static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 static void check_new_cluster_logical_replication_slots(void);
+static void check_new_cluster_subscription_configuration(void);
 static void check_old_cluster_for_valid_slots(bool live_check);
+static void check_old_cluster_subscription_state(void);
 
 
 /*
@@ -112,13 +114,21 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/*
-	 * Logical replication slots can be migrated since PG17. See comments atop
-	 * get_old_cluster_logical_slot_infos().
-	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+	{
+		/*
+		 * Logical replication slots can be migrated since PG17. See comments
+		 * atop get_old_cluster_logical_slot_infos().
+		 */
 		check_old_cluster_for_valid_slots(live_check);
 
+		/*
+		 * Subscription dependencies can be migrated since PG17. See comments
+		 * atop get_db_subscription_count().
+		 */
+		check_old_cluster_subscription_state();
+	}
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -237,6 +247,8 @@ check_new_cluster(void)
 	check_for_new_tablespace_dir();
 
 	check_new_cluster_logical_replication_slots();
+
+	check_new_cluster_subscription_configuration();
 }
 
 
@@ -1538,6 +1550,52 @@ check_new_cluster_logical_replication_slots(void)
 	check_ok();
 }
 
+/*
+ * check_new_cluster_subscription_configuration()
+ *
+ * Verify that the max_replication_slots configuration specified is enough for
+ * creating the subscriptions.
+ */
+static void
+check_new_cluster_subscription_configuration(void)
+{
+	PGresult   *res;
+	PGconn	   *conn;
+	int			nsubs_on_old;
+	int			max_replication_slots;
+
+	/* Logical slots can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	nsubs_on_old = count_old_cluster_subscriptions();
+
+	/* Quick return if there are no subscriptions to be migrated. */
+	if (nsubs_on_old == 0)
+		return;
+
+	prep_status("Checking for new cluster configuration for subscriptions");
+
+	conn = connectToServer(&new_cluster, "template1");
+
+	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+							"WHERE name = 'max_replication_slots';");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine parameter settings on new cluster");
+
+	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
+	if (nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
+	PQclear(res);
+	PQfinish(conn);
+
+	check_ok();
+}
+
 /*
  * check_old_cluster_for_valid_slots()
  *
@@ -1613,3 +1671,122 @@ check_old_cluster_for_valid_slots(bool live_check)
 
 	check_ok();
 }
+
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that each of the subscriptions has all their corresponding tables in
+ * i (initialize) or r (ready).
+ */
+static void
+check_old_cluster_subscription_state(void)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subs_invalid.txt");
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &old_cluster.dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(&old_cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "replication origin is missing for database:\"%s\" subscription:\"%s\"\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * A slot not created yet refers to the 'i' (initialize) state, while
+		 * 'r' (ready) state refer to a slot created previously but already
+		 * dropped. These states are supported states for upgrade. The other
+		 * states listed below are not ok:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * tracked by the publisher does not match anymore.
+		 *
+		 * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+		 * would retain the replication origin in certain cases.
+		 *
+		 * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, n.nspname, c.relname, r.srsubstate "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\" state:\"%s\" not in required state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without origin or having relations not in i (initialize) or r (ready) state.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..fb8250002f 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,6 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
+static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,10 +294,14 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		get_rel_infos(cluster, pDbInfo);
 
 		/*
-		 * Retrieve the logical replication slots infos for the old cluster.
+		 * Retrieve the logical replication slots infos and the subscriptions
+		 * count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
+		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
+			get_db_subscription_count(pDbInfo);
+		}
 	}
 
 	if (cluster == &old_cluster)
@@ -730,6 +735,55 @@ count_old_cluster_logical_slots(void)
 	return slot_count;
 }
 
+/*
+ * get_db_subscription_count()
+ *
+ * Gets the number of subscriptions in the database.
+ *
+ * Note: This function does nothing if the old cluster is pre-PG17, because
+ * logical slots are not migrated from such versions, so the logical
+ * replication cluster cannot be upgraded completely anyway.
+ */
+static void
+get_db_subscription_count(DbInfo *dbinfo)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	res = executeQueryOrDie(conn, "SELECT count(*) "
+							"FROM pg_catalog.pg_subscription WHERE subdbid = %d",
+							dbinfo->db_oid);
+	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscriptions for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 and prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_db_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+	int			nsubs = 0;
+
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+	return nsubs;
+}
+
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..d63f13fffc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -195,6 +195,7 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
+	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -421,6 +422,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
+int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..98a05a352a
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,337 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+my $oldbindir = $old_sub->config_data('--bindir');
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $newbindir = $new_sub->config_data('--bindir');
+
+sub insert_line_at_pub
+{
+	my $payload = shift;
+
+	foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line_at_pub('before initial sync');
+
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub FOR TABLE tab_upgraded1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION regress_pub"
+);
+
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+# After the above wait_for_subscription_sync call the table can be either in
+# 'syncdone' or in 'ready' state. Now wait till the table reaches 'ready' state.
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready or in
+# init state.
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded1 VALUES (generate_series(2,50), 'before initial sync')"
+);
+$publisher->wait_for_catchup('regress_sub');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub->restart;
+
+# Add tab_upgraded2 to the publication. Now publication has tab_upgraded1
+# and tab_upgraded2 tables.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+
+# The tables will be in init state as the subscriber configuration for
+# max_logical_replication_workers is set to 0.
+$synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach init state";
+
+# Get the replication origin remote_lsn of the old subscriber
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+# Insert a row in tab_upgraded1 and tab_not_upgraded1 publisher table while
+# it's down.
+insert_line_at_pub('while old_sub is down');
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode
+	],
+	'run of pg_upgrade for old instance when the subscription tables are in ready state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Add tab_not_upgraded1 to the publication. Now publication has tab_upgraded1,
+# tab_upgraded2 and tab_not_upgraded1 tables.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_not_upgraded1");
+
+$new_sub->start;
+
+# Subscription relations should be preserved. The upgraded subscriber won't know
+# about 'tab_not_upgraded1' because the subscription is not yet refreshed.
+my $result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2),
+	"There should be 2 rows in pg_subscription_rel(representing tab_upgraded1 and tab_upgraded2)"
+);
+
+# The replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s where os.external_id = 'pg_' || s.oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# Check the number of rows for each table on each server
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check initial tab_upgraded1 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2), "check initial tab_upgraded2 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2), "check initial tab_not_upgraded1 table data on publisher");
+
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(50),
+	"check initial tab_upgraded1 table data on the new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(0),
+	"check initial tab_upgraded2 table data on upgraded subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"check initial tab_not_upgraded1 table data on the new subscriber");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub ENABLE");
+
+$publisher->wait_for_catchup('regress_sub');
+
+# Rows on tab_upgraded1 and tab_upgraded2 should have been replicated, while
+# nothing should happen for tab_not_upgraded1.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2),
+	"check the data is synced after enabling the subscription for the table that was in init state"
+);
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"no change in table tab_not_upgraded1 after enable subscription which is not part of the publication"
+);
+
+# Refresh the subscription, the missing row on tab_not_upgraded1 should be
+# replicated.
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+# cleanup
+$new_sub->stop;
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 4");
+$old_sub->start;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than number of subscriptions in the old cluster.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+$old_sub->stop;
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+$old_sub->start;
+
+# Drop the subscription
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run in:
+# a) if there's a subscription with tables in a state different than
+#    'r' (ready) or 'i' (init) state and/or
+# b) if the subscription has no replication origin.
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$old_sub->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
+$old_sub->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub1 FOR TABLE tab_primary_key");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+
+my $subid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+my $reporigin = 'pg_' . qq($subid);
+
+# Drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub1->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/database:\"postgres\" subscription:\"regress_sub1\" schema:\"public\" relation:\"tab_primary_key\" state:\"d\" not in required state/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub2 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/replication origin is missing for database:\"postgres\" subscription:\"regress_sub2\"/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fb58dee3bc..45c681db5e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11396,6 +11396,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index dba3498a13..eaa5c5a7cb 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2661,6 +2661,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1
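For reference, the two catalog checks the patch adds can also be run by hand against each database of the old cluster before attempting the upgrade. This is a minimal sketch reusing the patch's own queries; the zero-row expectation is my reading of the checks, not something the patch itself prints:

```sql
-- Subscriptions whose replication origin is missing (must return 0 rows
-- for pg_upgrade to proceed):
SELECT d.datname, s.subname
FROM pg_catalog.pg_subscription s
LEFT JOIN pg_catalog.pg_replication_origin o
       ON o.roname = 'pg_' || s.oid
JOIN pg_catalog.pg_database d ON d.oid = s.subdbid
WHERE o.roname IS NULL;

-- Subscription relations in a state other than 'i' (init) or 'r' (ready)
-- (must also return 0 rows):
SELECT s.subname, n.nspname, c.relname, r.srsubstate
FROM pg_catalog.pg_subscription_rel r
JOIN pg_catalog.pg_subscription s ON r.srsubid = s.oid
JOIN pg_catalog.pg_class c ON r.srrelid = c.oid
JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid
WHERE r.srsubstate NOT IN ('i', 'r');
```

On the new cluster, max_replication_slots must additionally be at least the total subscription count across all old-cluster databases, which is what check_new_cluster_subscription_configuration() enforces.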

#142vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#140)
Re: pg_upgrade and logical replication

On Thu, 23 Nov 2023 at 05:56, Peter Smith <smithpb2250@gmail.com> wrote:

Here are some review comments for patch v17-0001

======
src/bin/pg_dump/pg_dump.c

1. getSubscriptionTables

+/*
+ * getSubscriptionTables
+ *   Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode and for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+ DumpOptions *dopt = fout->dopt;
+ SubscriptionInfo *subinfo = NULL;
+ SubRelInfo *subrinfo;
+ PQExpBuffer query;
+ PGresult   *res;
+ int i_srsubid;
+ int i_srrelid;
+ int i_srsubstate;
+ int i_srsublsn;
+ int ntups;
+ Oid last_srsubid = InvalidOid;
+
+ if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+ fout->remoteVersion < 170000)
+ return;

I still felt that the function comment ("used only in binary-upgrade
mode and for PG17 or later") was misleading. IMO that sounds like it
would be OK for PG17 regardless of the binary mode, but the code says
otherwise.

Assuming the code is correct, perhaps the comment should say:
"... used only in binary-upgrade mode for PG17 or later versions."

Modified

~~~

2. dumpSubscriptionTable

+/*
+ * dumpSubscriptionTable
+ *   Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode and for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)

(this is the same as the previous review comment #1)

Assuming the code is correct, perhaps the comment should say:
"... used only in binary-upgrade mode for PG17 or later versions."

Modified

======
src/bin/pg_upgrade/check.c

3.
+static void
+check_old_cluster_subscription_state()
+{
+ FILE    *script = NULL;
+ char output_path[MAXPGPATH];
+ int ntup;
+ ClusterInfo *cluster = &old_cluster;
+
+ prep_status("Checking for subscription state");
+
+ snprintf(output_path, sizeof(output_path), "%s/%s",
+ log_opts.basedir,
+ "subs_invalid.txt");
+ for (int dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+ {
+ PGresult   *res;
+ DbInfo    *active_db = &cluster->dbarr.dbs[dbnum];
+ PGconn    *conn = connectToServer(cluster, active_db->db_name);

There seems no need for an extra variable ('cluster') here when you
can just reference 'old_cluster' directly in the code, the same as
other functions in this file do all the time.

Modified

The v18 version patch attached at [1] has the changes for the same.
[1]: /messages/by-id/CALDaNm3wyYY5ywFpCwUVW1_Di1af3WxeZggGEDQEu8qa58a7FQ@mail.gmail.com

#143Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#141)
Re: pg_upgrade and logical replication

I have only trivial review comments for patch v18-0001

======
src/bin/pg_upgrade/check.c

1. check_new_cluster_subscription_configuration

+ /*
+ * A slot not created yet refers to the 'i' (initialize) state, while
+ * 'r' (ready) state refer to a slot created previously but already
+ * dropped. These states are supported states for upgrade. The other
+ * states listed below are not ok:
+ *
+ * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+ * would retain a replication slot, which could not be dropped by the
+ * sync worker spawned after the upgrade because the subscription ID
+ * tracked by the publisher does not match anymore.
+ *
+ * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+ * would retain the replication origin in certain cases.
+ *
+ * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+ * a relation upgraded while in this state would expect an origin ID
+ * with the OID of the subscription used before the upgrade, causing
+ * it to fail.
+ *
+ * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+ * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+ * so we need not allow these states.
+ */

1a.
/while 'r' (ready) state refer to a slot/while 'r' (ready) state
refers to a slot/

1b.
/These states are supported states for upgrade./These states are
supported for pg_upgrade./

1c
/The other states listed below are not ok./The other states listed
below are not supported./

======
src/bin/pg_upgrade/t/004_subscription.pl

2.
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run in:
+# a) if there's a subscription with tables in a state different than
+#    'r' (ready) or 'i' (init) state and/or
+# b) if the subscription has no replication origin.
+# ------------------------------------------------------

/if there's a subscription with tables in a state different than 'r'
(ready) or 'i' (init) state and/if there's a subscription with tables
in a state other than 'r' (ready) or 'i' (init) and/

======
Kind Regards,
Peter Smith.
Fujitsu Australia
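For context on what the 004_subscription.pl test discussed above exercises: after pg_upgrade the subscriptions come back disabled, and the preserved pg_subscription_rel rows plus the advanced replication origin are what make the post-upgrade sequence safe. A sketch using the test's subscription name:

```sql
-- On the upgraded subscriber: the apply worker resumes from the preserved
-- origin remote_lsn, so publisher changes made while the subscriber was
-- down are neither lost nor applied twice.
ALTER SUBSCRIPTION regress_sub ENABLE;

-- Tables added to the publication after the upgrade are only picked up
-- (and initially synced) by an explicit refresh.
ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION;
```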

#144vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#143)
1 attachment(s)
Re: pg_upgrade and logical replication

On Fri, 24 Nov 2023 at 07:00, Peter Smith <smithpb2250@gmail.com> wrote:

I have only trivial review comments for patch v18-0001

======
src/bin/pg_upgrade/check.c

1. check_new_cluster_subscription_configuration

+ /*
+ * A slot not created yet refers to the 'i' (initialize) state, while
+ * 'r' (ready) state refer to a slot created previously but already
+ * dropped. These states are supported states for upgrade. The other
+ * states listed below are not ok:
+ *
+ * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+ * would retain a replication slot, which could not be dropped by the
+ * sync worker spawned after the upgrade because the subscription ID
+ * tracked by the publisher does not match anymore.
+ *
+ * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+ * would retain the replication origin in certain cases.
+ *
+ * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+ * a relation upgraded while in this state would expect an origin ID
+ * with the OID of the subscription used before the upgrade, causing
+ * it to fail.
+ *
+ * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+ * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+ * so we need not allow these states.
+ */

1a.
/while 'r' (ready) state refer to a slot/while 'r' (ready) state
refers to a slot/

Modified

1b.
/These states are supported states for upgrade./These states are
supported for pg_upgrade./

Modified

1c
/The other states listed below are not ok./The other states listed
below are not supported./

Modified

======
src/bin/pg_upgrade/t/004_subscription.pl

2.
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run in:
+# a) if there's a subscription with tables in a state different than
+#    'r' (ready) or 'i' (init) state and/or
+# b) if the subscription has no replication origin.
+# ------------------------------------------------------

/if there's a subscription with tables in a state different than 'r'
(ready) or 'i' (init) state and/if there's a subscription with tables
in a state other than 'r' (ready) or 'i' (init) and/

Modified

The attached v19 version patch has the changes for the same.

Regards,
Vignesh

Attachments:

v19-0001-Preserve-the-full-subscription-s-state-during-pg.patchtext/x-patch; charset=US-ASCII; name=v19-0001-Preserve-the-full-subscription-s-state-during-pg.patchDownload
From 5f3a248ff1f723c01d55bcaeb665c5f8f38824a1 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v19] Preserve the full subscription's state during pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_add_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_add_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, the state is the state of the relation, sublsn is subscription lsn
which is optional, and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values (subname, relid, state and sublsn) from the
old cluster.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To fix this problem, this patch teaches pg_dump to update the replication
origin along with create subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is subscription lsn.
pg_dump will retrieve these values (subname and sublsn) from the old cluster.

pg_upgrade will check that all the subscription relations are in 'i' (init) or
in 'r' (ready) state, and will error out if that's not the case, logging the
reason for the failure.

Author: Vignesh C, Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  71 +++++
 src/backend/utils/adt/pg_upgrade_support.c | 125 ++++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 200 +++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  16 +
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 185 ++++++++++-
 src/bin/pg_upgrade/info.c                  |  56 +++-
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   2 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 337 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/tools/pgindent/typedefs.list           |   1 +
 13 files changed, 1027 insertions(+), 10 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 4f78e0e1c0..1e0104d5a1 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,77 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies, which include the subscription table information present in
+     the <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and also the subscription replication origin. This allows
+     logical replication on the new subscriber to continue from where the
+     old subscriber left off. Migration of subscription dependencies is only
+     supported when the old cluster is version 17.0 or later. Subscription
+     dependencies on clusters before version 17.0 will silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met, an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize) or <literal>r</literal> (ready). This
+       can be verified by checking <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, do the following:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Create equivalent subscriber tables for any tables that were newly added
+       to the publication during the upgrade, and refresh the publication by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..53cfa72b6f 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,22 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +312,121 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	remote_commit;
+	char		originname[NAMEDATALEN];
+	RepOriginId node;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	remote_commit = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* Lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	node = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster, and the shutdown checkpoint will then flush the replication
+	 * origins to disk.
+	 */
+	replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 34fd0a86e9..39ebd9b3aa 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4583,6 +4584,95 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBufferStr(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+						 " FROM pg_catalog.pg_subscription_rel"
+						 " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4609,6 +4699,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4664,16 +4755,28 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn\n");
+	else
+		appendPQExpBufferStr(query, " NULL AS suboriginremotelsn\n");
+
+	appendPQExpBufferStr(query,
+						 "FROM pg_subscription s\n");
+
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query,
+							 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+							 "    ON o.external_id = 'pg_' || s.oid::text \n");
+
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4700,6 +4803,7 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4737,6 +4841,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4746,6 +4855,76 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to pg_subscription_rel table. This will be used only in
+		 * binary-upgrade mode.
+		 */
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+		appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+		appendPQExpBuffer(query,
+						  ", %u, '%c'",
+						  subrinfo->tblinfo->dobj.catId.oid,
+						  subrinfo->srsubstate);
+
+		if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+			appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn);
+		else
+			appendPQExpBuffer(query, ", NULL");
+
+		appendPQExpBufferStr(query, ");\n");
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4826,6 +5005,17 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000 &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10444,6 +10634,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18510,6 +18703,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..62b3d9249b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -671,8 +672,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +711,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +771,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..e21bd56cb0 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -34,7 +34,9 @@ static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 static void check_new_cluster_logical_replication_slots(void);
+static void check_new_cluster_subscription_configuration(void);
 static void check_old_cluster_for_valid_slots(bool live_check);
+static void check_old_cluster_subscription_state(void);
 
 
 /*
@@ -112,13 +114,21 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/*
-	 * Logical replication slots can be migrated since PG17. See comments atop
-	 * get_old_cluster_logical_slot_infos().
-	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+	{
+		/*
+		 * Logical replication slots can be migrated since PG17. See comments
+		 * atop get_old_cluster_logical_slot_infos().
+		 */
 		check_old_cluster_for_valid_slots(live_check);
 
+		/*
+		 * Subscription dependencies can be migrated since PG17. See comments
+		 * atop get_db_subscription_count().
+		 */
+		check_old_cluster_subscription_state();
+	}
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -237,6 +247,8 @@ check_new_cluster(void)
 	check_for_new_tablespace_dir();
 
 	check_new_cluster_logical_replication_slots();
+
+	check_new_cluster_subscription_configuration();
 }
 
 
@@ -1538,6 +1550,52 @@ check_new_cluster_logical_replication_slots(void)
 	check_ok();
 }
 
+/*
+ * check_new_cluster_subscription_configuration()
+ *
+ * Verify that the max_replication_slots configuration specified is enough for
+ * creating the subscriptions.
+ */
+static void
+check_new_cluster_subscription_configuration(void)
+{
+	PGresult   *res;
+	PGconn	   *conn;
+	int			nsubs_on_old;
+	int			max_replication_slots;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	nsubs_on_old = count_old_cluster_subscriptions();
+
+	/* Quick return if there are no subscriptions to be migrated. */
+	if (nsubs_on_old == 0)
+		return;
+
+	prep_status("Checking for new cluster configuration for subscriptions");
+
+	conn = connectToServer(&new_cluster, "template1");
+
+	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+							"WHERE name = 'max_replication_slots';");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine parameter settings on new cluster");
+
+	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
+	if (nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
+	PQclear(res);
+	PQfinish(conn);
+
+	check_ok();
+}
+
 /*
  * check_old_cluster_for_valid_slots()
  *
@@ -1613,3 +1671,122 @@ check_old_cluster_for_valid_slots(bool live_check)
 
 	check_ok();
 }
+
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that each of the subscriptions has all their corresponding tables in
+ * i (initialize) or r (ready).
+ */
+static void
+check_old_cluster_subscription_state(void)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subs_invalid.txt");
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &old_cluster.dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(&old_cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "replication origin is missing for database:\"%s\" subscription:\"%s\"\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * The 'i' (initialize) state means the slot has not been created yet,
+		 * while the 'r' (ready) state means the slot was created previously
+		 * but has already been dropped. These states are supported by
+		 * pg_upgrade. The other states listed below are not supported:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * tracked by the publisher does not match anymore.
+		 *
+		 * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+		 * would retain the replication origin in certain cases.
+		 *
+		 * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, n.nspname, c.relname, r.srsubstate "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\" state:\"%s\" not in required state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in the 'i' (initialize) or 'r' (ready) state.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..fb8250002f 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,6 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
+static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,10 +294,14 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		get_rel_infos(cluster, pDbInfo);
 
 		/*
-		 * Retrieve the logical replication slots infos for the old cluster.
+		 * Retrieve the logical replication slots infos and the subscriptions
+		 * count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
+		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
+			get_db_subscription_count(pDbInfo);
+		}
 	}
 
 	if (cluster == &old_cluster)
@@ -730,6 +735,55 @@ count_old_cluster_logical_slots(void)
 	return slot_count;
 }
 
+/*
+ * get_db_subscription_count()
+ *
+ * Gets the number of subscriptions in the database.
+ *
+ * Note: This function does nothing if the old cluster is pre-PG17. Logical
+ * slots are not upgraded from older versions, so we would not be able to
+ * upgrade a logical replication cluster completely anyway.
+ */
+static void
+get_db_subscription_count(DbInfo *dbinfo)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	res = executeQueryOrDie(conn, "SELECT count(*) "
+							"FROM pg_catalog.pg_subscription WHERE subdbid = %d",
+							dbinfo->db_oid);
+	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscriptions for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 or prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_db_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+	int			nsubs = 0;
+
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+	return nsubs;
+}
+
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..d63f13fffc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -195,6 +195,7 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
+	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -421,6 +422,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
+int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..af5bbb4103
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,337 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+my $oldbindir = $old_sub->config_data('--bindir');
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $newbindir = $new_sub->config_data('--bindir');
+
+sub insert_line_at_pub
+{
+	my $payload = shift;
+
+	foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line_at_pub('before initial sync');
+
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub FOR TABLE tab_upgraded1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION regress_pub"
+);
+
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+# After the above wait_for_subscription_sync call, the table can be in either
+# 'syncdone' or 'ready' state. Now wait until the table reaches 'ready' state.
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready or in
+# init state.
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded1 VALUES (generate_series(2,50), 'before initial sync')"
+);
+$publisher->wait_for_catchup('regress_sub');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub->restart;
+
+# Add tab_upgraded2 to the publication. Now publication has tab_upgraded1
+# and tab_upgraded2 tables.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+
+# The tables will be in init state as the subscriber configuration for
+# max_logical_replication_workers is set to 0.
+$synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach init state";
+
+# Get the replication origin remote_lsn of the old subscriber
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+# Insert a row into each publisher table while the old subscriber is
+# down.
+insert_line_at_pub('while old_sub is down');
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode
+	],
+	'run of pg_upgrade for old instance when the subscription tables are in init/ready state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Add tab_not_upgraded1 to the publication. Now publication has tab_upgraded1,
+# tab_upgraded2 and tab_not_upgraded1 tables.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_not_upgraded1");
+
+$new_sub->start;
+
+# Subscription relations should be preserved. The upgraded subscriber won't know
+# about 'tab_not_upgraded1' because the subscription is not yet refreshed.
+my $result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2),
+	"There should be 2 rows in pg_subscription_rel (representing tab_upgraded1 and tab_upgraded2)"
+);
+
+# The replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s where os.external_id = 'pg_' || s.oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# Check the number of rows for each table on each server
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check initial tab_upgraded1 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2), "check initial tab_upgraded2 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2), "check initial tab_not_upgraded1 table data on publisher");
+
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(50),
+	"check initial tab_upgraded1 table data on the new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(0),
+	"check initial tab_upgraded2 table data on upgraded subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"check initial tab_not_upgraded1 table data on the new subscriber");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub ENABLE");
+
+$publisher->wait_for_catchup('regress_sub');
+
+# Rows on tab_upgraded1 and tab_upgraded2 should have been replicated, while
+# nothing should happen for tab_not_upgraded1.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2),
+	"check the data is synced after enabling the subscription for the table that was in init state"
+);
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"no change in table tab_not_upgraded1 after enabling the subscription, as the table is not yet part of the subscription"
+);
+
+# Refresh the subscription, the missing row on tab_not_upgraded1 should be
+# replicated.
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+# cleanup
+$new_sub->stop;
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 4");
+$old_sub->start;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than number of subscriptions in the old cluster.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+$old_sub->stop;
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+$old_sub->start;
+
+# Drop the subscription
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init), and/or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$old_sub->safe_psql('postgres',
+	"CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+# Insert the same value that is already present on the publisher into the
+# primary key column on the subscriber so that the table sync will fail.
+$old_sub->safe_psql('postgres',
+	"INSERT INTO tab_primary_key values(1, 'before initial sync')");
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub1 FOR TABLE tab_primary_key");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+
+# The table will be in 'd' (data is being copied) state as table sync will
+# fail because of a primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+
+my $subid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+my $reporigin = 'pg_' . qq($subid);
+
+# Drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync (invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub1->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/database:\"postgres\" subscription:\"regress_sub1\" schema:\"public\" relation:\"tab_primary_key\" state:\"d\" not in required state/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub2 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/replication origin is missing for database:\"postgres\" subscription:\"regress_sub2\"/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fb58dee3bc..45c681db5e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11396,6 +11396,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index dba3498a13..eaa5c5a7cb 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2661,6 +2661,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#145vignesh C
vignesh21@gmail.com
In reply to: Michael Paquier (#133)
2 attachment(s)
Re: pg_upgrade and logical replication

On Mon, 20 Nov 2023 at 05:27, Michael Paquier <michael@paquier.xyz> wrote:
> On Sun, Nov 19, 2023 at 06:56:05AM +0530, vignesh C wrote:
>> On Sun, 19 Nov 2023 at 06:52, vignesh C <vignesh21@gmail.com> wrote:
>>> On Fri, 10 Nov 2023 at 19:26, vignesh C <vignesh21@gmail.com> wrote:
>>>> I will analyze more on this and post the analysis in the subsequent mail.
>>>
>>> I analyzed further and felt that retaining the subscription oid would be
>>> cleaner, as subscription/subscription_rel/replication_origin/replication_origin_status
>>> would all be using the same oid as earlier, and it would probably also help
>>> in supporting the upgrade of subscriptions in more scenarios later.
>>> Here is a patch to handle the same.
>>
>> Sorry, I had attached the older patch; here is the correct updated one.
>
> Thanks for digging into that.  I think that we should consider that
> once the main patch is merged and stable in the tree for v17 to get a
> more consistent experience.

Yes, that approach makes sense.

> Shouldn't this include a test in the new
> TAP test for the upgrade of subscriptions?  It should be as simple as
> cross-checking the OIDs of the subscriptions before and after the
> upgrade.

Added a test for the same.

The changes for the same are present in the v19-0002 patch.
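
For anyone following along, the prerequisites that the patch enforces can also be checked manually on the old cluster before attempting the upgrade. The SQL below is only a sketch (the exact queries used by pg_upgrade's check code may differ); the 'pg_' || oid origin-name format follows the patch itself:

```sql
-- Subscription relations must be in 'i' (init) or 'r' (ready) state;
-- any row returned here would make pg_upgrade --check fail.
SELECT s.subname, sr.srrelid::regclass AS relation, sr.srsubstate
FROM pg_subscription s
JOIN pg_subscription_rel sr ON sr.srsubid = s.oid
WHERE sr.srsubstate NOT IN ('i', 'r');

-- Each subscription must still have its replication origin; a
-- subscription listed here is missing one (e.g. after
-- pg_replication_origin_drop, as exercised in the TAP test).
SELECT s.subname
FROM pg_subscription s
LEFT JOIN pg_replication_origin o ON o.roname = 'pg_' || s.oid::text
WHERE o.roident IS NULL;
```

Comparing `SELECT oid, subname FROM pg_subscription` on the old and new clusters is likewise the simplest form of the OID cross-check suggested above.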

Regards,
Vignesh

Attachments:

v19-0001-Preserve-the-full-subscription-s-state-during-pg.patchtext/x-patch; charset=US-ASCII; name=v19-0001-Preserve-the-full-subscription-s-state-during-pg.patchDownload
From 4bcd9d6c92a1959c15c14e480d3707c644934eab Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v19 1/2] Preserve the full subscription's state during
 pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state, it's impossible to re-enable the
subscriptions without missing some records, as the list of relations can only
be refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using the
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_add_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_add_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, state is the state of the relation, and sublsn is the subscription
lsn, which is optional and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values (subname, relid, state and sublsn) from the
old cluster.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To address this, this patch also teaches pg_dump to update the replication
origin along with the CREATE SUBSCRIPTION command, using the
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin's remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is the subscription
lsn. pg_dump will retrieve these values (subname and sublsn) from the old
cluster.

pg_upgrade will check that all the subscription relations are in 'i' (init) or
in 'r' (ready) state, and will error out if that's not the case, logging the
reason for the failure.

Author: Vignesh C, Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  71 +++++
 src/backend/utils/adt/pg_upgrade_support.c | 125 ++++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 200 ++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  16 +
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 185 +++++++++++-
 src/bin/pg_upgrade/info.c                  |  56 +++-
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   2 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 333 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/tools/pgindent/typedefs.list           |   1 +
 13 files changed, 1023 insertions(+), 10 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 4f78e0e1c0..1e0104d5a1 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,77 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies which includes the subscription table information present in
+     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and also the subscription replication origin. This allows
+     logical replication on the new subscriber to continue from where the
+     old subscriber was up to. Migration of subscription dependencies is only
+     supported when the old cluster is version 17.0 or later. Subscription
+     dependencies on clusters before version 17.0 will silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize) or <literal>r</literal> (ready). This
+       can be verified by checking <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, do this:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Create subscriber tables corresponding to any tables newly added to the
+       publication during the upgrade, and refresh the publication by executing
+       <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..53cfa72b6f 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,22 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +312,121 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	remote_commit;
+	char		originname[NAMEDATALEN];
+	RepOriginId node;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	remote_commit = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* Lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	node = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster. Shutdown server will flush the origins during shutdown
+	 * checkpoint.
+	 */
+	replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 34fd0a86e9..39ebd9b3aa 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4583,6 +4584,95 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4609,6 +4699,7 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
 	int			i,
 				ntups;
 
@@ -4664,16 +4755,28 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn\n");
+	else
+		appendPQExpBufferStr(query, " NULL AS suboriginremotelsn\n");
+
+	appendPQExpBufferStr(query,
+						 "FROM pg_subscription s\n");
+
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query,
+							 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+							 "    ON o.external_id = 'pg_' || s.oid::text \n");
+
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4700,6 +4803,7 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4737,6 +4841,11 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4746,6 +4855,76 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to the pg_subscription_rel catalog. It is for use by binary-upgrade
+		 * mode only.
+		 */
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+		appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+		appendPQExpBuffer(query,
+						  ", %u, '%c'",
+						  subrinfo->tblinfo->dobj.catId.oid,
+						  subrinfo->srsubstate);
+
+		if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+			appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn);
+		else
+			appendPQExpBufferStr(query, ", NULL");
+
+		appendPQExpBufferStr(query, ");\n");
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4826,6 +5005,17 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000 &&
+		subinfo->suboriginremotelsn)
+	{
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+		appendStringLiteralAH(query, subinfo->dobj.name, fout);
+		appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10444,6 +10634,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18510,6 +18703,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..62b3d9249b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -671,8 +672,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +711,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +771,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..e21bd56cb0 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -34,7 +34,9 @@ static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 static void check_new_cluster_logical_replication_slots(void);
+static void check_new_cluster_subscription_configuration(void);
 static void check_old_cluster_for_valid_slots(bool live_check);
+static void check_old_cluster_subscription_state(void);
 
 
 /*
@@ -112,13 +114,21 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/*
-	 * Logical replication slots can be migrated since PG17. See comments atop
-	 * get_old_cluster_logical_slot_infos().
-	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+	{
+		/*
+		 * Logical replication slots can be migrated since PG17. See comments
+		 * atop get_old_cluster_logical_slot_infos().
+		 */
 		check_old_cluster_for_valid_slots(live_check);
 
+		/*
+		 * Subscription dependencies can be migrated since PG17. See comments
+		 * atop get_db_subscription_count().
+		 */
+		check_old_cluster_subscription_state();
+	}
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -237,6 +247,8 @@ check_new_cluster(void)
 	check_for_new_tablespace_dir();
 
 	check_new_cluster_logical_replication_slots();
+
+	check_new_cluster_subscription_configuration();
 }
 
 
@@ -1538,6 +1550,52 @@ check_new_cluster_logical_replication_slots(void)
 	check_ok();
 }
 
+/*
+ * check_new_cluster_subscription_configuration()
+ *
+ * Verify that the max_replication_slots setting on the new cluster is
+ * sufficient to create all the subscriptions being migrated.
+ */
+static void
+check_new_cluster_subscription_configuration(void)
+{
+	PGresult   *res;
+	PGconn	   *conn;
+	int			nsubs_on_old;
+	int			max_replication_slots;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	nsubs_on_old = count_old_cluster_subscriptions();
+
+	/* Quick return if there are no subscriptions to be migrated. */
+	if (nsubs_on_old == 0)
+		return;
+
+	prep_status("Checking for new cluster configuration for subscriptions");
+
+	conn = connectToServer(&new_cluster, "template1");
+
+	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+							"WHERE name = 'max_replication_slots';");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine parameter settings on new cluster");
+
+	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
+	if (nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
+	PQclear(res);
+	PQfinish(conn);
+
+	check_ok();
+}
+
 /*
  * check_old_cluster_for_valid_slots()
  *
@@ -1613,3 +1671,122 @@ check_old_cluster_for_valid_slots(bool live_check)
 
 	check_ok();
 }
+
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that all the relations of each subscription are in either the
+ * 'i' (initialize) or 'r' (ready) state.
+ */
+static void
+check_old_cluster_subscription_state(void)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subs_invalid.txt");
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &old_cluster.dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(&old_cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "replication origin is missing for database:\"%s\" subscription:\"%s\"\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * A slot not created yet refers to the 'i' (initialize) state, while
+		 * 'r' (ready) state refers to a slot created previously but already
+		 * dropped. These states are supported for pg_upgrade. The other
+		 * states listed below are not supported:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * tracked by the publisher does not match anymore.
+		 *
+		 * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+		 * would retain the replication origin in certain cases.
+		 *
+		 * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, n.nspname, c.relname, r.srsubstate "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\" state:\"%s\" not in required state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in 'i' (initialize) or 'r' (ready) state.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..fb8250002f 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,6 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
+static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,10 +294,14 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		get_rel_infos(cluster, pDbInfo);
 
 		/*
-		 * Retrieve the logical replication slots infos for the old cluster.
+		 * Retrieve the logical replication slot information and the
+		 * subscription count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
+		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
+			get_db_subscription_count(pDbInfo);
+		}
 	}
 
 	if (cluster == &old_cluster)
@@ -730,6 +735,55 @@ count_old_cluster_logical_slots(void)
 	return slot_count;
 }
 
+/*
+ * get_db_subscription_count()
+ *
+ * Gets the number of subscriptions in the given database.
+ *
+ * Note: This function does nothing if the old cluster is pre-PG17.  Logical
+ * slots are not migrated from older versions, so a logical replication
+ * setup could not be upgraded completely anyway.
+ */
+static void
+get_db_subscription_count(DbInfo *dbinfo)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	res = executeQueryOrDie(conn, "SELECT count(*) "
							"FROM pg_catalog.pg_subscription WHERE subdbid = %u",
+							dbinfo->db_oid);
+	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscriptions for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 or prior,
+ * because we gather subscription counts only for clusters of version PG17
+ * or newer. See get_db_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+	int			nsubs = 0;
+
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+	return nsubs;
+}
+
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..d63f13fffc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -195,6 +195,7 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
+	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -421,6 +422,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
+int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..7f6085751a
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,333 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Tests for pg_upgrade of logical subscriptions
+use strict;
+use warnings;
+
+use File::Find qw(find);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+my $oldbindir = $old_sub->config_data('--bindir');
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $newbindir = $new_sub->config_data('--bindir');
+
+sub insert_line_at_pub
+{
+	my $payload = shift;
+
+	foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line_at_pub('before initial sync');
+
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub FOR TABLE tab_upgraded1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION regress_pub"
+);
+
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+# After the above wait_for_subscription_sync call the table can be either in
+# 'syncdone' or in 'ready' state. Now wait till the table reaches 'ready' state.
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready or in
+# init state.
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded1 VALUES (generate_series(2,50), 'before initial sync')"
+);
+$publisher->wait_for_catchup('regress_sub');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub->restart;
+
+# Add tab_upgraded2 to the publication. Now publication has tab_upgraded1
+# and tab_upgraded2 tables.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+
+# The tables will be in init state as the subscriber configuration for
+# max_logical_replication_workers is set to 0.
+$synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach init state";
+
+# Get the replication origin remote_lsn of the old subscriber
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status");
+$old_sub->stop;
+
+# Insert a row into each of the publisher tables while the old subscriber
+# is down.
+insert_line_at_pub('while old_sub is down');
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode
+	],
	'run of pg_upgrade for old instance when the subscription tables are in ready or init state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Add tab_not_upgraded1 to the publication. Now publication has tab_upgraded1,
+# tab_upgraded2 and tab_not_upgraded1 tables.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_not_upgraded1");
+
+$new_sub->start;
+
+# Subscription relations should be preserved. The upgraded subscriber won't know
+# about 'tab_not_upgraded1' because the subscription is not yet refreshed.
+my $result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2),
+	"There should be 2 rows in pg_subscription_rel (representing tab_upgraded1 and tab_upgraded2)"
+);
+
+# The replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s where os.external_id = 'pg_' || s.oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# Check the number of rows for each table on each server
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check initial tab_upgraded1 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2), "check initial tab_upgraded2 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2), "check initial tab_not_upgraded1 table data on publisher");
+
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(50),
+	"check initial tab_upgraded1 table data on the new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(0),
+	"check initial tab_upgraded2 table data on upgraded subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"check initial tab_not_upgraded1 table data on the new subscriber");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub ENABLE");
+
+$publisher->wait_for_catchup('regress_sub');
+
+# Rows on tab_upgraded1 and tab_upgraded2 should have been replicated, while
+# nothing should happen for tab_not_upgraded1.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2),
+	"check the data is synced after enabling the subscription for the table that was in init state"
+);
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"no change in tab_not_upgraded1 after enabling the subscription, because the subscription has not been refreshed to include it"
+);
+
+# Refresh the subscription; the missing rows on tab_not_upgraded1 should
+# then be replicated.
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+# cleanup
+$new_sub->stop;
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 4");
+$old_sub->start;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than number of subscriptions in the old cluster.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+$old_sub->stop;
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+$old_sub->start;
+
+# Drop the subscription
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init), and/or
+# b) a subscription has no replication origin.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);
+		INSERT INTO tab_primary_key values(1, 'before initial sync');
+		CREATE PUBLICATION regress_pub1 FOR TABLE tab_primary_key;
+]);
+
+# Insert the same value that is already present on the publisher into the
+# subscriber's primary key column, so that the table sync will fail.
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);
+		INSERT INTO tab_primary_key values(1, 'before initial sync');
+		CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1;
+]);
+
+# The table will remain in 'd' (data is being copied) state, as the table
+# sync fails with a primary key violation.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub1 WITH (enabled=false)"
+);
+
+my $subid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+my $reporigin = 'pg_' . $subid;
+
+# Drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
	'run of pg_upgrade --check for old instance with relation in \'d\' datasync (invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# millisecond timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub1->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/database:\"postgres\" subscription:\"regress_sub1\" schema:\"public\" relation:\"tab_primary_key\" state:\"d\" not in required state/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub2 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/replication origin is missing for database:\"postgres\" subscription:\"regress_sub2\"/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fb58dee3bc..45c681db5e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11396,6 +11396,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index dba3498a13..eaa5c5a7cb 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2661,6 +2661,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

Attachment: v19-0002-Retain-the-subscription-oids-during-upgrade.patch (text/x-patch)
From e21e353335897d303c96d73a057f55e5e9fdca20 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Fri, 24 Nov 2023 21:07:39 +0530
Subject: [PATCH v19 2/2] Retain the subscription oids during upgrade.

Retain the subscription oids during upgrade.
---
 src/backend/commands/subscriptioncmds.c       | 22 +++++++++++++++++--
 src/backend/utils/adt/pg_upgrade_support.c    | 10 +++++++++
 src/bin/pg_dump/pg_dump.c                     |  8 +++++++
 src/bin/pg_upgrade/t/004_subscription.pl      |  9 ++++++++
 src/include/catalog/binary_upgrade.h          |  1 +
 src/include/catalog/pg_proc.dat               |  4 ++++
 .../expected/spgist_name_ops.out              |  6 +++--
 7 files changed, 56 insertions(+), 4 deletions(-)

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index edc82c11be..1c7bb4b7cd 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -75,6 +75,9 @@
 /* check if the 'val' has 'bits' set */
 #define IsSet(val, bits)  (((val) & (bits)) == (bits))
 
+/* Potentially set by pg_upgrade_support functions */
+Oid			binary_upgrade_next_pg_subscription_oid = InvalidOid;
+
 /*
  * Structure to hold a bitmap representing the user-provided CREATE/ALTER
  * SUBSCRIPTION command options and the parsed/default values of each of them.
@@ -679,8 +682,23 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	memset(values, 0, sizeof(values));
 	memset(nulls, false, sizeof(nulls));
 
-	subid = GetNewOidWithIndex(rel, SubscriptionObjectIndexId,
-							   Anum_pg_subscription_oid);
+	/* Use binary-upgrade override for pg_subscription.oid? */
+	if (IsBinaryUpgrade)
+	{
+		if (!OidIsValid(binary_upgrade_next_pg_subscription_oid))
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("pg_subscription OID value not set when in binary upgrade mode")));
+
+		subid = binary_upgrade_next_pg_subscription_oid;
+		binary_upgrade_next_pg_subscription_oid = InvalidOid;
+	}
+	else
+	{
+		subid = GetNewOidWithIndex(rel, SubscriptionObjectIndexId,
+								   Anum_pg_subscription_oid);
+	}
+
 	values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
 	values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
 	values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 53cfa72b6f..9445bf2aaf 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -179,6 +179,16 @@ binary_upgrade_set_next_pg_authid_oid(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
+Datum
+binary_upgrade_set_next_pg_subscription_oid(PG_FUNCTION_ARGS)
+{
+	Oid			subid = PG_GETARG_OID(0);
+
+	CHECK_IS_BINARY_UPGRADE;
+	binary_upgrade_next_pg_subscription_oid = subid;
+	PG_RETURN_VOID();
+}
+
 Datum
 binary_upgrade_create_empty_extension(PG_FUNCTION_ARGS)
 {
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 39ebd9b3aa..601c1d72b9 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4954,6 +4954,14 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 	appendPQExpBuffer(delq, "DROP SUBSCRIPTION %s;\n",
 					  qsubname);
 
+	if (dopt->binary_upgrade)
+	{
+		appendPQExpBufferStr(query, "\n-- For binary upgrade, must preserve pg_subscription.oid\n");
+		appendPQExpBuffer(query,
+						  "SELECT pg_catalog.binary_upgrade_set_next_pg_subscription_oid('%u'::pg_catalog.oid);\n\n",
+						  subinfo->dobj.catId.oid);
+	}
+
 	appendPQExpBuffer(query, "CREATE SUBSCRIPTION %s CONNECTION ",
 					  qsubname);
 	appendStringLiteralAH(query, subinfo->subconninfo, fout);
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
index 7f6085751a..174bfa516b 100644
--- a/src/bin/pg_upgrade/t/004_subscription.pl
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -101,6 +101,11 @@ $old_sub->poll_query_until('postgres', $synced_query)
 # Get the replication origin remote_lsn of the old subscriber
 my $remote_lsn = $old_sub->safe_psql('postgres',
 	"SELECT remote_lsn FROM pg_replication_origin_status");
+
+# Get the subscription oid of the old subscriber
+my $sub_oid =
+  $old_sub->safe_psql('postgres', "SELECT oid FROM pg_subscription");
+
 $old_sub->stop;
 
 # Insert a row in tab_upgraded1 and tab_not_upgraded1 publisher table while
@@ -141,6 +146,10 @@ $result = $new_sub->safe_psql('postgres',
 );
 is($result, qq($remote_lsn), "remote_lsn should have been preserved");
 
+# The subscription oid should be preserved
+$result = $new_sub->safe_psql('postgres', "SELECT oid FROM pg_subscription");
+is($result, qq($sub_oid), "subscription oid should have been preserved");
+
 # Check the number of rows for each table on each server
 $result =
   $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 82a9125ba9..dc7b251051 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -32,6 +32,7 @@ extern PGDLLIMPORT RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumbe
 
 extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
 extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
+extern PGDLLIMPORT Oid binary_upgrade_next_pg_subscription_oid;
 
 extern PGDLLIMPORT bool binary_upgrade_record_init_privs;
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 45c681db5e..27184212c7 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11406,6 +11406,10 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'void',
   proargtypes => 'text pg_lsn',
   prosrc => 'binary_upgrade_replorigin_advance' },
+{ oid => '8406', descr => 'for use by pg_upgrade',
+  proname => 'binary_upgrade_set_next_pg_subscription_oid', provolatile => 'v',
+  proparallel => 'r', prorettype => 'void', proargtypes => 'oid',
+  prosrc => 'binary_upgrade_set_next_pg_subscription_oid' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/test/modules/spgist_name_ops/expected/spgist_name_ops.out b/src/test/modules/spgist_name_ops/expected/spgist_name_ops.out
index 1ee65ede24..39d43368c4 100644
--- a/src/test/modules/spgist_name_ops/expected/spgist_name_ops.out
+++ b/src/test/modules/spgist_name_ops/expected/spgist_name_ops.out
@@ -59,11 +59,12 @@ select * from t
  binary_upgrade_set_next_multirange_pg_type_oid       |  1 | binary_upgrade_set_next_multirange_pg_type_oid
  binary_upgrade_set_next_pg_authid_oid                |    | binary_upgrade_set_next_pg_authid_oid
  binary_upgrade_set_next_pg_enum_oid                  |    | binary_upgrade_set_next_pg_enum_oid
+ binary_upgrade_set_next_pg_subscription_oid          |    | binary_upgrade_set_next_pg_subscription_oid
  binary_upgrade_set_next_pg_tablespace_oid            |    | binary_upgrade_set_next_pg_tablespace_oid
  binary_upgrade_set_next_pg_type_oid                  |    | binary_upgrade_set_next_pg_type_oid
  binary_upgrade_set_next_toast_pg_class_oid           |  1 | binary_upgrade_set_next_toast_pg_class_oid
  binary_upgrade_set_next_toast_relfilenode            |    | binary_upgrade_set_next_toast_relfilenode
-(13 rows)
+(14 rows)
 
 -- Verify clean failure when INCLUDE'd columns result in overlength tuple
 -- The error message details are platform-dependent, so show only SQLSTATE
@@ -108,11 +109,12 @@ select * from t
  binary_upgrade_set_next_multirange_pg_type_oid       |  1 | binary_upgrade_set_next_multirange_pg_type_oid
  binary_upgrade_set_next_pg_authid_oid                |    | binary_upgrade_set_next_pg_authid_oid
  binary_upgrade_set_next_pg_enum_oid                  |    | binary_upgrade_set_next_pg_enum_oid
+ binary_upgrade_set_next_pg_subscription_oid          |    | binary_upgrade_set_next_pg_subscription_oid
  binary_upgrade_set_next_pg_tablespace_oid            |    | binary_upgrade_set_next_pg_tablespace_oid
  binary_upgrade_set_next_pg_type_oid                  |    | binary_upgrade_set_next_pg_type_oid
  binary_upgrade_set_next_toast_pg_class_oid           |  1 | binary_upgrade_set_next_toast_pg_class_oid
  binary_upgrade_set_next_toast_relfilenode            |    | binary_upgrade_set_next_toast_relfilenode
-(13 rows)
+(14 rows)
 
 \set VERBOSITY sqlstate
 insert into t values(repeat('xyzzy', 12), 42, repeat('xyzzy', 4000));
-- 
2.34.1

#146 Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#145)
Re: pg_upgrade and logical replication

On Sat, Nov 25, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:

Few comments on v19:
==================
1.
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, do this:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... ENABLE</command></link>.

The reason for this restriction is not very clear to me. Is it because
we are using pg_dump for subscription and the existing functionality
is doing it? If so, I think currently even connect is false.

2.
+ * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+ * would retain the replication origin in certain cases.

I think this is vague. Can we briefly describe cases where the origins
would be retained?

3. I think the cases where the publisher is also upgraded restoring
the origin's LSN is of no use. Currently, I can't see a problem with
restoring stale originLSN in such cases as we won't be able to
distinguish during the upgrade but I think we should document it in
the comments somewhere in the patch.
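
For readers following along, the relation states under discussion live in
pg_subscription_rel. A manual pre-upgrade check equivalent to the validation
the patch performs might look like this (a sketch; any row returned would
block the upgrade):

```sql
-- List subscription relations on the old subscriber that are not in
-- 'i' (init) or 'r' (ready) state, i.e. the states pg_upgrade requires.
SELECT s.subname, sr.srrelid::regclass AS relation, sr.srsubstate
FROM pg_subscription_rel sr
JOIN pg_subscription s ON s.oid = sr.srsubid
WHERE sr.srsubstate NOT IN ('i', 'r');
```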

--
With Regards,
Amit Kapila.

#147 Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#145)
Re: pg_upgrade and logical replication

Here are some review comments for patch set v19*

//////

v19-0001.

No comments

///////

v19-0002.

(I saw that both changes below seemed cut/paste from similar
functions, but I will ask the questions anyway).

======
src/backend/commands/subscriptioncmds.c

1.
+/* Potentially set by pg_upgrade_support functions */
+Oid binary_upgrade_next_pg_subscription_oid = InvalidOid;
+

The comment "by pg_upgrade_support functions" seemed a bit vague. IMO
you might as well tell the name of the function that sets this.

SUGGESTION
Potentially set by the pg_upgrade_support function --
binary_upgrade_set_next_pg_subscription_oid().

~~~

2. CreateSubscription

+ if (!OidIsValid(binary_upgrade_next_pg_subscription_oid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("pg_subscription OID value not set when in binary upgrade mode")));

Doesn't this condition mean some kind of impossible internal error
occurred -- i.e. should this be elog instead of ereport?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#148 vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#146)
Re: pg_upgrade and logical replication

On Sat, 25 Nov 2023 at 17:50, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Nov 25, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:

Few comments on v19:
==================
1.
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, do this:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... ENABLE</command></link>.

The reason for this restriction is not very clear to me. Is it because
we are using pg_dump for subscription and the existing functionality
is doing it? If so, I think currently even connect is false.

This was done this way so that the apply worker doesn't get started
while the upgrade is happening. Now that we have set
max_logical_replication_workers to 0, the apply workers will not get
started during the upgrade process. I think now we can create the
subscriptions with the same options as the old cluster in case of
upgrade.

2.
+ * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+ * would retain the replication origin in certain cases.

I think this is vague. Can we briefly describe cases where the origins
would be retained?

I will modify this in the next version

3. I think the cases where the publisher is also upgraded restoring
the origin's LSN is of no use. Currently, I can't see a problem with
restoring stale originLSN in such cases as we won't be able to
distinguish during the upgrade but I think we should document it in
the comments somewhere in the patch.

I will add a comment for this in the next version

Regards,
Vignesh

#149 Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#148)
Re: pg_upgrade and logical replication

On Mon, Nov 27, 2023 at 3:18 PM vignesh C <vignesh21@gmail.com> wrote:

On Sat, 25 Nov 2023 at 17:50, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Nov 25, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:

Few comments on v19:
==================
1.
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, do this:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... ENABLE</command></link>.

The reason for this restriction is not very clear to me. Is it because
we are using pg_dump for subscription and the existing functionality
is doing it? If so, I think currently even connect is false.

This was done this way so that the apply worker doesn't get started
while the upgrade is happening. Now that we have set
max_logical_replication_workers to 0, the apply workers will not get
started during the upgrade process. I think now we can create the
subscriptions with the same options as the old cluster in case of
upgrade.

Okay, but what is your plan to change it? Currently, we are relying on
existing pg_dump code to dump subscriptions data, do you want to
change that? There is a reason for the current behavior of pg_dump
which as mentioned in docs is: "When dumping logical replication
subscriptions, pg_dump will generate CREATE SUBSCRIPTION commands that
use the connect = false option, so that restoring the subscription
does not make remote connections for creating a replication slot or
for initial table copy. That way, the dump can be restored without
requiring network access to the remote servers. It is then up to the
user to reactivate the subscriptions in a suitable way. If the
involved hosts have changed, the connection information might have to
be changed. It might also be appropriate to truncate the target tables
before initiating a new full table copy."
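
The dump/restore flow quoted above amounts to something like the following
(subscription, publication, and connection names are illustrative):

```sql
-- What pg_dump emits: no remote connection is made at restore time.
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=pub.example dbname=postgres'
    PUBLICATION mypub
    WITH (connect = false);

-- Afterwards, the user must reactivate the subscription by hand:
ALTER SUBSCRIPTION mysub ENABLE;
ALTER SUBSCRIPTION mysub REFRESH PUBLICATION;
```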

I guess one reason to not enable subscription after restore was that
it can't work without origins, and also one can restore the dump in a
totally different environment, and one may choose not to dump all the
corresponding tables which I don't think is true for an upgrade. So,
that could be one reason to do differently for upgrades. Do we see
reasons similar to pg_dump/restore due to which after upgrade
subscriptions may not work?

--
With Regards,
Amit Kapila.

#150 vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#149)
2 attachment(s)
Re: pg_upgrade and logical replication

On Mon, 27 Nov 2023 at 17:12, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 27, 2023 at 3:18 PM vignesh C <vignesh21@gmail.com> wrote:

On Sat, 25 Nov 2023 at 17:50, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Nov 25, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:

Few comments on v19:
==================
1.
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, do this:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... ENABLE</command></link>.

The reason for this restriction is not very clear to me. Is it because
we are using pg_dump for subscription and the existing functionality
is doing it? If so, I think currently even connect is false.

This was done this way so that the apply worker doesn't get started
while the upgrade is happening. Now that we have set
max_logical_replication_workers to 0, the apply workers will not get
started during the upgrade process. I think now we can create the
subscriptions with the same options as the old cluster in case of
upgrade.

Okay, but what is your plan to change it? Currently, we are relying on
existing pg_dump code to dump subscriptions data, do you want to
change that? There is a reason for the current behavior of pg_dump
which as mentioned in docs is: "When dumping logical replication
subscriptions, pg_dump will generate CREATE SUBSCRIPTION commands that
use the connect = false option, so that restoring the subscription
does not make remote connections for creating a replication slot or
for initial table copy. That way, the dump can be restored without
requiring network access to the remote servers. It is then up to the
user to reactivate the subscriptions in a suitable way. If the
involved hosts have changed, the connection information might have to
be changed. It might also be appropriate to truncate the target tables
before initiating a new full table copy."

I guess one reason to not enable subscription after restore was that
it can't work without origins, and also one can restore the dump in a
totally different environment, and one may choose not to dump all the
corresponding tables which I don't think is true for an upgrade. So,
that could be one reason to do differently for upgrades. Do we see
reasons similar to pg_dump/restore due to which after upgrade
subscriptions may not work?

I felt that the behavior for upgrade can be slightly different from plain
dump, as the subscription relations and the replication origin will be
updated when the subscriber is upgraded. And since the logical replication
workers will not be started during the upgrade, we can preserve the
subscription's enabled status too. I felt that just emitting an
"ALTER SUBSCRIPTION sub-name ENABLE" for the subscriptions that were
enabled in the old cluster, as done in the attached patch, should be fine
in the upgrade case. The behavior of plain dump is not changed; it is
retained as is.
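
Under that approach, the binary-upgrade dump for an enabled subscription
would end with something like the following sketch, combining the functions
introduced by the two patches (OIDs, names, and LSNs are illustrative; these
support functions only run in binary-upgrade mode):

```sql
-- Preserve the subscription's OID, then recreate it without connecting.
SELECT pg_catalog.binary_upgrade_set_next_pg_subscription_oid('16400'::pg_catalog.oid);
CREATE SUBSCRIPTION regress_sub1
    CONNECTION 'host=old_publisher dbname=postgres'
    PUBLICATION regress_pub1
    WITH (connect = false);

-- Restore pg_subscription_rel content and the replication origin LSN.
SELECT binary_upgrade_add_sub_rel_state('regress_sub1', 16390, 'r', '0/1570D50');
SELECT binary_upgrade_replorigin_advance('regress_sub1', '0/1570D50');

-- Restore the enabled state from the old cluster.
ALTER SUBSCRIPTION regress_sub1 ENABLE;
```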

Regards,
Vignesh

Attachments:

v20-0002-Retain-the-subscription-oids-during-upgrade.patch (text/x-patch)
From 7bbce54014434f23ba1e30390bbb903ebf174134 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Tue, 28 Nov 2023 15:35:42 +0530
Subject: [PATCH v20 2/2] Retain the subscription oids during upgrade.

Retain the subscription oids during upgrade.
---
 src/backend/commands/subscriptioncmds.c       | 25 +++++++++++++++++--
 src/backend/utils/adt/pg_upgrade_support.c    | 10 ++++++++
 src/bin/pg_dump/pg_dump.c                     |  8 ++++++
 src/bin/pg_upgrade/t/004_subscription.pl      |  4 +++
 src/include/catalog/binary_upgrade.h          |  1 +
 src/include/catalog/pg_proc.dat               |  4 +++
 .../expected/spgist_name_ops.out              |  6 +++--
 7 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index edc82c11be..f839989208 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -75,6 +75,12 @@
 /* check if the 'val' has 'bits' set */
 #define IsSet(val, bits)  (((val) & (bits)) == (bits))
 
+/*
+ * This will be set by the pg_upgrade_support function --
+ * binary_upgrade_set_next_pg_subscription_oid().
+ */
+Oid			binary_upgrade_next_pg_subscription_oid = InvalidOid;
+
 /*
  * Structure to hold a bitmap representing the user-provided CREATE/ALTER
  * SUBSCRIPTION command options and the parsed/default values of each of them.
@@ -679,8 +685,23 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 	memset(values, 0, sizeof(values));
 	memset(nulls, false, sizeof(nulls));
 
-	subid = GetNewOidWithIndex(rel, SubscriptionObjectIndexId,
-							   Anum_pg_subscription_oid);
+	/* Use binary-upgrade override for pg_subscription.oid? */
+	if (IsBinaryUpgrade)
+	{
+		if (!OidIsValid(binary_upgrade_next_pg_subscription_oid))
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("pg_subscription OID value not set when in binary upgrade mode")));
+
+		subid = binary_upgrade_next_pg_subscription_oid;
+		binary_upgrade_next_pg_subscription_oid = InvalidOid;
+	}
+	else
+	{
+		subid = GetNewOidWithIndex(rel, SubscriptionObjectIndexId,
+								   Anum_pg_subscription_oid);
+	}
+
 	values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
 	values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
 	values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 53cfa72b6f..9445bf2aaf 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -179,6 +179,16 @@ binary_upgrade_set_next_pg_authid_oid(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
+Datum
+binary_upgrade_set_next_pg_subscription_oid(PG_FUNCTION_ARGS)
+{
+	Oid			subid = PG_GETARG_OID(0);
+
+	CHECK_IS_BINARY_UPGRADE;
+	binary_upgrade_next_pg_subscription_oid = subid;
+	PG_RETURN_VOID();
+}
+
 Datum
 binary_upgrade_create_empty_extension(PG_FUNCTION_ARGS)
 {
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4a4bafba11..d008a5caaf 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4963,6 +4963,14 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 	appendPQExpBuffer(delq, "DROP SUBSCRIPTION %s;\n",
 					  qsubname);
 
+	if (dopt->binary_upgrade)
+	{
+		appendPQExpBufferStr(query, "\n-- For binary upgrade, must preserve pg_subscription.oid\n");
+		appendPQExpBuffer(query,
+						  "SELECT pg_catalog.binary_upgrade_set_next_pg_subscription_oid('%u'::pg_catalog.oid);\n\n",
+						  subinfo->dobj.catId.oid);
+	}
+
 	appendPQExpBuffer(query, "CREATE SUBSCRIPTION %s CONNECTION ",
 					  qsubname);
 	appendStringLiteralAH(query, subinfo->subconninfo, fout);
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
index 0b35afa1b6..924e69734b 100644
--- a/src/bin/pg_upgrade/t/004_subscription.pl
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -171,6 +171,10 @@ $result = $new_sub->safe_psql('postgres',
 is($result, qq($remote_lsn), "remote_lsn should have been preserved");
 
 
+# The subscription oid should be preserved
+$result = $new_sub->safe_psql('postgres', "SELECT oid FROM pg_subscription");
+is($result, qq($sub_oid), "subscription oid should have been preserved");
+
 # Check the number of rows for each table on each server
 $result =
   $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 82a9125ba9..dc7b251051 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -32,6 +32,7 @@ extern PGDLLIMPORT RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumbe
 
 extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
 extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
+extern PGDLLIMPORT Oid binary_upgrade_next_pg_subscription_oid;
 
 extern PGDLLIMPORT bool binary_upgrade_record_init_privs;
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 45c681db5e..27184212c7 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11406,6 +11406,10 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'void',
   proargtypes => 'text pg_lsn',
   prosrc => 'binary_upgrade_replorigin_advance' },
+{ oid => '8406', descr => 'for use by pg_upgrade',
+  proname => 'binary_upgrade_set_next_pg_subscription_oid', provolatile => 'v',
+  proparallel => 'r', prorettype => 'void', proargtypes => 'oid',
+  prosrc => 'binary_upgrade_set_next_pg_subscription_oid' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/test/modules/spgist_name_ops/expected/spgist_name_ops.out b/src/test/modules/spgist_name_ops/expected/spgist_name_ops.out
index 1ee65ede24..39d43368c4 100644
--- a/src/test/modules/spgist_name_ops/expected/spgist_name_ops.out
+++ b/src/test/modules/spgist_name_ops/expected/spgist_name_ops.out
@@ -59,11 +59,12 @@ select * from t
  binary_upgrade_set_next_multirange_pg_type_oid       |  1 | binary_upgrade_set_next_multirange_pg_type_oid
  binary_upgrade_set_next_pg_authid_oid                |    | binary_upgrade_set_next_pg_authid_oid
  binary_upgrade_set_next_pg_enum_oid                  |    | binary_upgrade_set_next_pg_enum_oid
+ binary_upgrade_set_next_pg_subscription_oid          |    | binary_upgrade_set_next_pg_subscription_oid
  binary_upgrade_set_next_pg_tablespace_oid            |    | binary_upgrade_set_next_pg_tablespace_oid
  binary_upgrade_set_next_pg_type_oid                  |    | binary_upgrade_set_next_pg_type_oid
  binary_upgrade_set_next_toast_pg_class_oid           |  1 | binary_upgrade_set_next_toast_pg_class_oid
  binary_upgrade_set_next_toast_relfilenode            |    | binary_upgrade_set_next_toast_relfilenode
-(13 rows)
+(14 rows)
 
 -- Verify clean failure when INCLUDE'd columns result in overlength tuple
 -- The error message details are platform-dependent, so show only SQLSTATE
@@ -108,11 +109,12 @@ select * from t
  binary_upgrade_set_next_multirange_pg_type_oid       |  1 | binary_upgrade_set_next_multirange_pg_type_oid
  binary_upgrade_set_next_pg_authid_oid                |    | binary_upgrade_set_next_pg_authid_oid
  binary_upgrade_set_next_pg_enum_oid                  |    | binary_upgrade_set_next_pg_enum_oid
+ binary_upgrade_set_next_pg_subscription_oid          |    | binary_upgrade_set_next_pg_subscription_oid
  binary_upgrade_set_next_pg_tablespace_oid            |    | binary_upgrade_set_next_pg_tablespace_oid
  binary_upgrade_set_next_pg_type_oid                  |    | binary_upgrade_set_next_pg_type_oid
  binary_upgrade_set_next_toast_pg_class_oid           |  1 | binary_upgrade_set_next_toast_pg_class_oid
  binary_upgrade_set_next_toast_relfilenode            |    | binary_upgrade_set_next_toast_relfilenode
-(13 rows)
+(14 rows)
 
 \set VERBOSITY sqlstate
 insert into t values(repeat('xyzzy', 12), 42, repeat('xyzzy', 4000));
-- 
2.34.1

v20-0001-Preserve-the-full-subscription-s-state-during-pg.patch (text/x-patch)
From 6f62b02d3fd5b86dd6f421293e3bc79ee932dfb8 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v20 1/2] Preserve the full subscription's state during
 pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_add_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_add_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, state is the state of the relation, and sublsn is the
subscription LSN, which is optional and defaults to NULL/InvalidXLogRecPtr
if not provided. pg_dump will retrieve these values (subname, relid, state
and sublsn) from the old cluster.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To fix this problem, this patch teaches pg_dump to update the replication
origin along with create subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is the
subscription LSN. pg_dump will retrieve these values (subname and sublsn)
from the old cluster.

pg_upgrade will check that all the subscription relations are in 'i' (init) or
in 'r' (ready) state, and will error out if that's not the case, logging the
reason for the failure.

Author: Vignesh C, Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  50 +++
 src/backend/utils/adt/pg_upgrade_support.c | 125 +++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 227 ++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  17 +
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 187 ++++++++++-
 src/bin/pg_upgrade/info.c                  |  56 +++-
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   2 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 368 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/tools/pgindent/typedefs.list           |   1 +
 13 files changed, 1067 insertions(+), 10 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 4f78e0e1c0..8c14047aa5 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,56 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies, which include the subscription table information present
+     in the <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and the subscription replication origin. This allows
+     logical replication on the new subscriber to continue from the point
+     where the old subscriber left off. Migration of subscription
+     dependencies is only supported when the old cluster is version 17.0 or
+     later. Subscription dependencies on clusters before version 17.0 will
+     silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met, an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize) or <literal>r</literal> (ready). This
+       can be verified by checking <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be verified by checking the
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system catalogs.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..53cfa72b6f 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,22 @@
 
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +312,121 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to the
+ * pg_subscription_rel catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	AddSubscriptionRelState(subid, relid, relstate, sublsn);
+
+	ReleaseSysCache(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	remote_commit;
+	char		originname[NAMEDATALEN];
+	RepOriginId node;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	remote_commit = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* Lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	node = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster, and the shutdown checkpoint will flush the origins to disk.
+	 */
+	replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 34fd0a86e9..4a4bafba11 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -296,6 +296,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4583,6 +4584,95 @@ is_superuser(Archive *fout)
 	return false;
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * getSubscriptions
  *	  get information about subscriptions
@@ -4609,6 +4699,8 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
+	int			i_subenabled;
 	int			i,
 				ntups;
 
@@ -4664,16 +4756,33 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n");
+	else
+		appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n");
+
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, " s.subenabled\n");
+	else
+		appendPQExpBufferStr(query, " false AS subenabled\n");
+
+	appendPQExpBufferStr(query,
+						 "FROM pg_subscription s\n");
+
+	if (fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query,
+							 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+							 "    ON o.external_id = 'pg_' || s.oid::text \n");
+
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4700,6 +4809,8 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
+	i_subenabled = PQfnumber(res, "subenabled");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4737,6 +4848,13 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+		subinfo[i].subenabled =
+			pg_strdup(PQgetvalue(res, i, i_subenabled));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4746,6 +4864,76 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to the pg_subscription_rel catalog. This is used only in
+		 * binary-upgrade mode.
+		 */
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+		appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+		appendPQExpBuffer(query,
+						  ", %u, '%c'",
+						  subrinfo->tblinfo->dobj.catId.oid,
+						  subrinfo->srsubstate);
+
+		if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+			appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn);
+		else
+			appendPQExpBuffer(query, ", NULL");
+
+		appendPQExpBufferStr(query, ");\n");
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4826,6 +5014,35 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+	{
+		if (subinfo->suboriginremotelsn)
+		{
+			/*
+			 * Preserve the remote_lsn for the subscriber's replication
+			 * origin. This value will be stale if the publisher gets
+			 * upgraded, and we currently have no mechanism to detect that
+			 * scenario. Even so, updating remote_lsn with a stale value is
+			 * harmless here, as the upgrade ensures that all transactions
+			 * are replicated before the publisher is upgraded.
+			 */
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+			appendStringLiteralAH(query, subinfo->dobj.name, fout);
+			appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+		}
+
+		if (strcmp(subinfo->subenabled, "t") == 0)
+		{
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber's running state.\n");
+			appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s ENABLE;\n", qsubname);
+		}
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10444,6 +10661,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18510,6 +18730,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..7ce34288ea 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -660,6 +661,7 @@ typedef struct _SubscriptionInfo
 {
 	DumpableObject dobj;
 	const char *rolname;
+	char	   *subenabled;
 	char	   *subbinary;
 	char	   *substream;
 	char	   *subtwophasestate;
@@ -671,8 +673,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +712,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +772,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..4d6ae77e2d 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -34,7 +34,9 @@ static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 static void check_new_cluster_logical_replication_slots(void);
+static void check_new_cluster_subscription_configuration(void);
 static void check_old_cluster_for_valid_slots(bool live_check);
+static void check_old_cluster_subscription_state(void);
 
 
 /*
@@ -112,13 +114,21 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/*
-	 * Logical replication slots can be migrated since PG17. See comments atop
-	 * get_old_cluster_logical_slot_infos().
-	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+	{
+		/*
+		 * Logical replication slots can be migrated since PG17. See comments
+		 * atop get_old_cluster_logical_slot_infos().
+		 */
 		check_old_cluster_for_valid_slots(live_check);
 
+		/*
+		 * Subscription dependencies can be migrated since PG17. See comments
+		 * atop get_db_subscription_count().
+		 */
+		check_old_cluster_subscription_state();
+	}
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -237,6 +247,8 @@ check_new_cluster(void)
 	check_for_new_tablespace_dir();
 
 	check_new_cluster_logical_replication_slots();
+
+	check_new_cluster_subscription_configuration();
 }
 
 
@@ -1538,6 +1550,52 @@ check_new_cluster_logical_replication_slots(void)
 	check_ok();
 }
 
+/*
+ * check_new_cluster_subscription_configuration()
+ *
+ * Verify that the max_replication_slots setting on the new cluster is
+ * sufficient for creating the migrated subscriptions.
+ */
+static void
+check_new_cluster_subscription_configuration(void)
+{
+	PGresult   *res;
+	PGconn	   *conn;
+	int			nsubs_on_old;
+	int			max_replication_slots;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	nsubs_on_old = count_old_cluster_subscriptions();
+
+	/* Quick return if there are no subscriptions to be migrated. */
+	if (nsubs_on_old == 0)
+		return;
+
+	prep_status("Checking for new cluster configuration for subscriptions");
+
+	conn = connectToServer(&new_cluster, "template1");
+
+	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+							"WHERE name = 'max_replication_slots';");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine parameter settings on new cluster");
+
+	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
+	if (nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
+	PQclear(res);
+	PQfinish(conn);
+
+	check_ok();
+}
+
 /*
  * check_old_cluster_for_valid_slots()
  *
@@ -1613,3 +1671,124 @@ check_old_cluster_for_valid_slots(bool live_check)
 
 	check_ok();
 }
+
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that each subscription has all its corresponding tables in the
+ * 'i' (initialize) or 'r' (ready) state.
+ */
+static void
+check_old_cluster_subscription_state(void)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subs_invalid.txt");
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &old_cluster.dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(&old_cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "replication origin is missing for database:\"%s\" subscription:\"%s\"\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * In the 'i' (initialize) state the sync slot has not been created
+		 * yet, while in the 'r' (ready) state the slot was created previously
+		 * but has already been dropped. These states are supported by
+		 * pg_upgrade. The other states listed below are not supported:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * tracked by the publisher does not match anymore.
+		 *
+		 * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+		 * would retain the replication origin when there is a failure in
+		 * tablesync worker immediately after dropping the replication slot in
+		 * the publisher.
+		 *
+		 * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, n.nspname, c.relname, r.srsubstate "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\" state:\"%s\" not in required state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in i (initialize) or r (ready) state.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..fb8250002f 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,6 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
+static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,10 +294,14 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		get_rel_infos(cluster, pDbInfo);
 
 		/*
-		 * Retrieve the logical replication slots infos for the old cluster.
+		 * Retrieve the logical replication slot information and the
+		 * subscription count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
+		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
+			get_db_subscription_count(pDbInfo);
+		}
 	}
 
 	if (cluster == &old_cluster)
@@ -730,6 +735,55 @@ count_old_cluster_logical_slots(void)
 	return slot_count;
 }
 
+/*
+ * get_db_subscription_count()
+ *
+ * Gets the number of subscriptions in the database.
+ *
+ * Note: This function does nothing if the old cluster is pre-PG17. Logical
+ * slots are not upgraded from older versions, so the logical replication
+ * cluster could not be upgraded completely anyway.
+ */
+static void
+get_db_subscription_count(DbInfo *dbinfo)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	res = executeQueryOrDie(conn, "SELECT count(*) "
+							"FROM pg_catalog.pg_subscription WHERE subdbid = %u",
+							dbinfo->db_oid);
+	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscriptions for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 or prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_db_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+	int			nsubs = 0;
+
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+	return nsubs;
+}
+
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..d63f13fffc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -195,6 +195,7 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
+	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -421,6 +422,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
+int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..0b35afa1b6
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,368 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+my $oldbindir = $old_sub->config_data('--bindir');
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $newbindir = $new_sub->config_data('--bindir');
+
+sub insert_line_at_pub
+{
+	my $payload = shift;
+
+	foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+	{
+		$publisher->safe_psql('postgres',
+			"INSERT INTO " . $_ . " (val) VALUES('$payload')");
+	}
+}
+
+# Initial setup
+foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+{
+	$publisher->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+	$old_sub->safe_psql('postgres',
+		"CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line_at_pub('before initial sync');
+
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub FOR TABLE tab_upgraded1");
+
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION regress_pub"
+);
+
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+# After the above wait_for_subscription_sync call, the table can be in either
+# 'syncdone' or 'ready' state. Now wait until the table reaches 'ready' state.
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready or in
+# init state.
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded1 VALUES (generate_series(2,50), 'before initial sync')"
+);
+$publisher->wait_for_catchup('regress_sub');
+
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub->restart;
+
+# Add tab_upgraded2 to the publication. Now publication has tab_upgraded1
+# and tab_upgraded2 tables.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+
+# Get the subscription oid of the old subscriber
+my $sub_oid =
+  $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub'");
+
+# The tables will be in init state as the subscriber configuration for
+# max_logical_replication_workers is set to 0.
+$synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach init state";
+
+# Get the replication origin remote_lsn of the old subscriber
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status WHERE external_id = 'pg_' || $sub_oid"
+);
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub DISABLE");
+
+$old_sub->stop;
+
+# Insert a row into each publisher table (tab_upgraded1, tab_upgraded2 and
+# tab_not_upgraded1) while the old subscriber is down.
+insert_line_at_pub('while old_sub is down');
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode
+	],
+	'run of pg_upgrade for old instance when the subscription tables are in ready state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# Add tab_not_upgraded1 to the publication. Now publication has tab_upgraded1,
+# tab_upgraded2 and tab_not_upgraded1 tables.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub ADD TABLE tab_not_upgraded1");
+
+$new_sub->start;
+
+# The subscription's running status should be preserved
+my $result =
+  $new_sub->safe_psql('postgres',
+	"SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub'");
+is($result, qq(f),
+	"check that the subscriber that was disable on the old subscriber should be disabled in the new subscriber"
+);
+$result =
+  $new_sub->safe_psql('postgres',
+	"SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub1'");
+is($result, qq(t),
+	"check that the subscriber that was enabled on the old subscriber should be enabled in the new subscriber"
+);
+$new_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
+
+# Subscription relations should be preserved. The upgraded subscriber won't know
+# about 'tab_not_upgraded1' because the subscription is not yet refreshed.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2),
+	"there should be 2 rows in pg_subscription_rel (representing tab_upgraded1 and tab_upgraded2)"
+);
+
+# The replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+
+# Check the number of rows for each table on each server
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check initial tab_upgraded1 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2), "check initial tab_upgraded2 table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2), "check initial tab_not_upgraded1 table data on publisher");
+
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(50),
+	"check initial tab_upgraded1 table data on the new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(0),
+	"check initial tab_upgraded2 table data on upgraded subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"check initial tab_not_upgraded1 table data on the new subscriber");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub ENABLE");
+
+$publisher->wait_for_catchup('regress_sub');
+
+# Rows on tab_upgraded1 and tab_upgraded2 should have been replicated, while
+# nothing should happen for tab_not_upgraded1.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(2),
+	"check the data is synced after enabling the subscription for the table that was in init state"
+);
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+	"no change in table tab_not_upgraded1 after enable subscription which is not part of the publication"
+);
+
+# Refresh the subscription, the missing row on tab_not_upgraded1 should be
+# replicated.
+$new_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(2),
+	"check replicated inserts on new subscriber after refreshing");
+
+# cleanup
+$new_sub->stop;
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 4");
+$old_sub->start;
+
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub1 DISABLE");
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none)");
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than number of subscriptions in the old cluster.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+$old_sub->stop;
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+$old_sub->start;
+
+# Drop the subscription
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init), and/or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);
+		INSERT INTO tab_primary_key values(1, 'before initial sync');
+		CREATE PUBLICATION regress_pub2 FOR TABLE tab_primary_key;
+]);
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);
+		INSERT INTO tab_primary_key values(1, 'before initial sync');
+		CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2;
+]);
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub2 WITH (enabled=false)"
+);
+
+my $subid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub3'");
+my $reporigin = 'pg_' . qq($subid);
+
+# Drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync (invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub1->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/database:\"postgres\" subscription:\"regress_sub2\" schema:\"public\" relation:\"tab_primary_key\" state:\"d\" not in required state/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub2 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/replication origin is missing for database:\"postgres\" subscription:\"regress_sub3\"/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fb58dee3bc..45c681db5e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11396,6 +11396,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 86a9886d4f..e6d994923f 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2662,6 +2662,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#151vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#146)
Re: pg_upgrade and logical replication

On Sat, 25 Nov 2023 at 17:50, Amit Kapila <amit.kapila16@gmail.com> wrote:

2.
+ * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+ * would retain the replication origin in certain cases.

I think this is vague. Can we briefly describe cases where the origins
would be retained?

Modified

3. I think in cases where the publisher is also upgraded, restoring
the origin's LSN is of no use. Currently, I can't see a problem with
restoring a stale origin LSN in such cases, as we won't be able to
distinguish this during the upgrade, but I think we should document it
in the comments somewhere in the patch.

Added comments

These are handled in the v20 version patch attached at:
/messages/by-id/CALDaNm0ST1iSrJLD_CV6hQs=w4GZRCRdftQvQA3cO8Hq3QUvYw@mail.gmail.com

Regards,
Vignesh

#152vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#147)
Re: pg_upgrade and logical replication

On Mon, 27 Nov 2023 at 06:53, Peter Smith <smithpb2250@gmail.com> wrote:

Here are some review comments for patch set v19*

//////

v19-0001.

No comments

///////

v19-0002.

(I saw that both changes below seemed cut/paste from similar
functions, but I will ask the questions anyway).

======
src/backend/commands/subscriptioncmds.c

1.
+/* Potentially set by pg_upgrade_support functions */
+Oid binary_upgrade_next_pg_subscription_oid = InvalidOid;
+

The comment "by pg_upgrade_support functions" seemed a bit vague. IMO
you might as well tell the name of the function that sets this.

SUGGESTION
Potentially set by the pg_upgrade_support function --
binary_upgrade_set_next_pg_subscription_oid().

Modified

~~~

2. CreateSubscription

+ if (!OidIsValid(binary_upgrade_next_pg_subscription_oid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("pg_subscription OID value not set when in binary upgrade mode")));

Doesn't this condition mean some kind of impossible internal error
occurred -- i.e. should this be elog instead of ereport?

This is kind of a sanity check to prevent setting the subscription id
with an invalid oid. This can happen if the server is started in
binary upgrade mode and create subscription is called without calling
binary_upgrade_set_next_pg_subscription_oid.

The comment is handled in the v20 version patch attached at:
/messages/by-id/CALDaNm0ST1iSrJLD_CV6hQs=w4GZRCRdftQvQA3cO8Hq3QUvYw@mail.gmail.com

Regards,
Vignesh

#153Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#150)
1 attachment(s)
Re: pg_upgrade and logical replication

On Tue, Nov 28, 2023 at 4:12 PM vignesh C <vignesh21@gmail.com> wrote:

Few comments on the latest patch:
===========================
1.
+ if (fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n");
+ else
+ appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n");
+
+ if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, " s.subenabled\n");
+ else
+ appendPQExpBufferStr(query, " false AS subenabled\n");
+
+ appendPQExpBufferStr(query,
+ "FROM pg_subscription s\n");
+
+ if (fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query,
+ "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+ "    ON o.external_id = 'pg_' || s.oid::text \n");

Why does 'subenabled' have a check for binary_upgrade while
'suboriginremotelsn' doesn't?

2.
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+ Relation rel;
+ HeapTuple tup;
+ Oid subid;
+ Form_pg_subscription form;
+ char    *subname;
+ Oid relid;
+ char relstate;
+ XLogRecPtr sublsn;
+
+ CHECK_IS_BINARY_UPGRADE;
+
+ /* We must check these things before dereferencing the arguments */
+ if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+ elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is
not allowed");
+
+ subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ relid = PG_GETARG_OID(1);
+ relstate = PG_GETARG_CHAR(2);
+ sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+ tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+ if (!HeapTupleIsValid(tup))
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relation %u does not exist", relid));
+ ReleaseSysCache(tup);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);

Why is there no locking for the relation? I see that during subscription
operation, we do acquire AccessShareLock on the relation before adding
a corresponding entry in pg_subscription_rel. See the following code:

CreateSubscription()
{
...
foreach(lc, tables)
{
RangeVar *rv = (RangeVar *) lfirst(lc);
Oid relid;

relid = RangeVarGetRelid(rv, AccessShareLock, false);

/* Check for supported relkind. */
CheckSubscriptionRelkind(get_rel_relkind(relid),
rv->schemaname, rv->relname);

AddSubscriptionRelState(subid, relid, table_state,
InvalidXLogRecPtr);
...
}

3.
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
{
...
...
+ AddSubscriptionRelState(subid, relid, relstate, sublsn);
...
}

I see a problem with directly using this function: it doesn't release
locks, which means it expects the caller either to release those locks
or to postpone releasing them until the transaction end. However, all
the other binary_upgrade support functions don't
postpone releasing locks till the transaction ends. I think we should
add an additional parameter to indicate whether we want to release
locks and then pass it true from the binary upgrade support function.
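
In other words, the suggestion amounts to a signature along these lines (a sketch only; the parameter name and placement are assumptions, not the committed code):

```c
/*
 * Sketch of the suggested change: an extra flag telling
 * AddSubscriptionRelState() whether to keep the relation lock until end
 * of transaction or release it immediately. The binary-upgrade support
 * function would pass true; other callers would keep today's behavior.
 * The name "retain_lock" is hypothetical here.
 */
void AddSubscriptionRelState(Oid subid, Oid relid, char state,
                             XLogRecPtr sublsn, bool retain_lock);
```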

4.
extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
int numTables);
extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);

getSubscriptions() and getSubscriptionTables() are defined in the
opposite order in the .c file. I think it is better to change the order
in the .c file unless there is a reason for not doing so.

5. At this stage, no need to update/send the 0002 patch, we can look
at it after the main patch is committed. That is anyway not directly
related to the main patch.

Apart from the above, I have modified a few comments and messages in
the attached. Kindly review and include the changes if you are fine
with those.

--
With Regards,
Amit Kapila.

Attachments:

changes_by_amit_1.patch.txttext/plain; charset=US-ASCII; name=changes_by_amit_1.patch.txtDownload
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4a4bafba11..a4e723f922 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -5014,18 +5014,21 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	/*
+	 * In binary-upgrade mode, we allow the replication to continue after the
+	 * upgrade.
+	 */
 	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
 	{
 		if (subinfo->suboriginremotelsn)
 		{
 			/*
 			 * Preserve the remote_lsn for the subscriber's replication
-			 * origin. This value will be stale if the publisher gets
-			 * upgraded, we don't have a mechanism to distinguish this
-			 * scenario currently. There is no problem even if the remote_lsn
-			 * is updated with a stale value in this case as upgrade ensures
-			 * that all the transactions will be replicated before upgrading
-			 * the publisher.
+			 * origin. This value is required to start the replication from the
+			 * position before the upgrade. This value will be stale if the
+			 * publisher gets upgraded before the subscriber node. However,
+			 * this shouldn't be a problem as the upgrade ensures that all the
+			 * transactions were replicated before upgrading the publisher.
 			 */
 			appendPQExpBufferStr(query,
 								 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
@@ -5037,6 +5040,10 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 		if (strcmp(subinfo->subenabled, "t") == 0)
 		{
+			/*
+			 * Enable the subscription to allow the replication to continue
+			 * after the upgrade.
+			 */
 			appendPQExpBufferStr(query,
 								 "\n-- For binary upgrade, must preserve the subscriber's running state.\n");
 			appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s ENABLE;\n", qsubname);
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 4d6ae77e2d..9fd1417f0a 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -123,8 +123,8 @@ check_and_dump_old_cluster(bool live_check)
 		check_old_cluster_for_valid_slots(live_check);
 
 		/*
-		 * Subscription dependencies can be migrated since PG17. See comments
-		 * atop get_db_subscription_count().
+		 * Subscription and its dependencies can be migrated since PG17. See
+		 * comments atop get_db_subscription_count().
 		 */
 		check_old_cluster_subscription_state();
 	}
@@ -1554,7 +1554,8 @@ check_new_cluster_logical_replication_slots(void)
  * check_new_cluster_subscription_configuration()
  *
  * Verify that the max_replication_slots configuration specified is enough for
- * creating the subscriptions.
+ * creating the subscriptions. This is required to create the replication
+ * origin for each subscription.
  */
 static void
 check_new_cluster_subscription_configuration(void)
@@ -1564,7 +1565,7 @@ check_new_cluster_subscription_configuration(void)
 	int			nsubs_on_old;
 	int			max_replication_slots;
 
-	/* Logical slots can be migrated since PG17. */
+	/* Subscriptions and their dependencies can be migrated since PG17. */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
 		return;
 
@@ -1726,15 +1727,18 @@ check_old_cluster_subscription_state(void)
 		}
 
 		/*
-		 * A slot not created yet refers to the 'i' (initialize) state, while
-		 * 'r' (ready) state refers to a slot created previously but already
-		 * dropped. These states are supported for pg_upgrade. The other
+		 * We don't allow upgrade if there is a risk of dangling slot or origin
+		 * corresponding to initial sync after upgrade.
+		 *
+		 * A slot/origin not created yet refers to the 'i' (initialize) state,
+		 * while 'r' (ready) state refers to a slot/origin created previously but
+		 * already dropped. These states are supported for pg_upgrade. The other
 		 * states listed below are not supported:
 		 *
 		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
 		 * would retain a replication slot, which could not be dropped by the
 		 * sync worker spawned after the upgrade because the subscription ID
-		 * tracked by the publisher does not match anymore.
+		 * used for the slot name won't match anymore.
 		 *
 		 * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
 		 * would retain the replication origin when there is a failure in
@@ -1786,6 +1790,7 @@ check_old_cluster_subscription_state(void)
 		fclose(script);
 		pg_log(PG_REPORT, "fatal");
 		pg_fatal("Your installation contains subscriptions without origin or having relations not in i (initialize) or r (ready) state.\n"
+				 "You can allow the initial sync to finish for all relations and then restart the upgrade.\n"
 				 "A list of the problem subscriptions is in the file:\n"
 				 "    %s", output_path);
 	}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index fb8250002f..cc73c0fc0c 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -738,7 +738,7 @@ count_old_cluster_logical_slots(void)
 /*
  * get_db_subscription_count()
  *
- * Gets the number of subscription count of the database.
+ * Gets the number of subscriptions of the database referred to by "dbinfo".
  *
  * Note: This function will not do anything if the old cluster is pre-PG17.
  * This is because before that the logical slots are not upgraded, so we will
#154Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#150)
Re: pg_upgrade and logical replication

Here are some review comments for patch v20-0001

======

1. getSubscriptions

+ if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, " s.subenabled\n");
+ else
+ appendPQExpBufferStr(query, " false AS subenabled\n");

Probably I misunderstood this logic... AFAIK the CREATE SUBSCRIPTION
is normally default *enabled*, so why does this code set default
differently as 'false'. OTOH, if this is some special case default
needed because the subscription upgrade is not supported before PG17
then maybe it needs a comment to explain.

~~~

2. dumpSubscription

+ if (strcmp(subinfo->subenabled, "t") == 0)
+ {
+ appendPQExpBufferStr(query,
+ "\n-- For binary upgrade, must preserve the subscriber's running state.\n");
+ appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s ENABLE;\n", qsubname);
+ }

(this is a bit similar to previous comment)

Probably I misunderstood this logic... but AFAIK the CREATE
SUBSCRIPTION is normally default *enabled*. In the CREATE SUBSCRIPTION
top of this function I did not see any "enabled=xxx" code, so won't
this just default to enabled=true per normal. In other words, what
happens if the subscription being upgraded was already DISABLED -- How
does it remain disabled still after upgrade?

But I saw there is a test case for this so perhaps the code is fine?
Maybe it just needs more explanatory comments for this area?

======
src/bin/pg_upgrade/t/004_subscription.pl

3.
+# The subscription's running status should be preserved
+my $result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub'");
+is($result, qq(f),
+ "check that the subscriber that was disable on the old subscriber
should be disabled in the new subscriber"
+);
+$result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub1'");
+is($result, qq(t),
+ "check that the subscriber that was enabled on the old subscriber
should be enabled in the new subscriber"
+);
+$new_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
+

BEFORE
check that the subscriber that was disable on the old subscriber
should be disabled in the new subscriber

SUGGESTION
check that a subscriber that was disabled on the old subscriber is
disabled on the new subscriber

~

BEFORE
check that the subscriber that was enabled on the old subscriber
should be enabled in the new subscriber

SUGGESTION
check that a subscriber that was enabled on the old subscriber is
enabled on the new subscriber

~~~

4.
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+
+# Check the number of rows for each table on each server

Double blank lines.

~~~

5.
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub1 DISABLE");
+$old_sub->safe_psql('postgres',
+ "ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none)");
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
+

Probably it would be tidier to combine all of those.
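
For instance, the three statements could be issued as a single batch (a sketch of the reviewer's suggestion, not the committed test code):

```sql
-- Disable the subscription, detach its slot, and drop it in one batch.
ALTER SUBSCRIPTION regress_sub1 DISABLE;
ALTER SUBSCRIPTION regress_sub1 SET (slot_name = NONE);
DROP SUBSCRIPTION regress_sub1;
```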

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#155Peter Smith
smithpb2250@gmail.com
In reply to: Peter Smith (#154)
Re: pg_upgrade and logical replication

On Thu, Nov 30, 2023 at 12:06 PM Peter Smith <smithpb2250@gmail.com> wrote:

Here are some review comments for patch v20-0001

3.
+# The subscription's running status should be preserved
+my $result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub'");
+is($result, qq(f),
+ "check that the subscriber that was disable on the old subscriber
should be disabled in the new subscriber"
+);
+$result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub1'");
+is($result, qq(t),
+ "check that the subscriber that was enabled on the old subscriber
should be enabled in the new subscriber"
+);
+$new_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
+

BEFORE
check that the subscriber that was disable on the old subscriber
should be disabled in the new subscriber

SUGGESTION
check that a subscriber that was disabled on the old subscriber is
disabled on the new subscriber

~

BEFORE
check that the subscriber that was enabled on the old subscriber
should be enabled in the new subscriber

SUGGESTION
check that a subscriber that was enabled on the old subscriber is
enabled on the new subscriber

Oops. I think that should have been "subscription", not "subscriber". i.e.

SUGGESTION
check that a subscription that was disabled on the old subscriber is
disabled on the new subscriber

and

SUGGESTION
check that a subscription that was enabled on the old subscriber is
enabled on the new subscriber

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#156Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Smith (#154)
Re: pg_upgrade and logical replication

On Thu, Nov 30, 2023 at 6:37 AM Peter Smith <smithpb2250@gmail.com> wrote:

Here are some review comments for patch v20-0001

======

1. getSubscriptions

+ if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, " s.subenabled\n");
+ else
+ appendPQExpBufferStr(query, " false AS subenabled\n");

Probably I misunderstood this logic... AFAIK CREATE SUBSCRIPTION
normally defaults to *enabled*, so why does this code set the default
to 'false'? OTOH, if this is some special-case default needed because
subscription upgrade is not supported before PG17, then maybe it needs
a comment to explain.

Yes, it is for prior versions. By default, subscriptions are restored
disabled even if they were enabled before the dump. See the docs [1] for
the reasons ("When dumping logical replication subscriptions, ..."). I
don't think we need a comment here, as that is the norm we use at other
similar places where we do version checking. One could argue for more
comments as to why 'connect' is false; if those are really required, we
should do that as a separate patch.

[1]: https://www.postgresql.org/docs/devel/app-pgdump.html
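
For illustration, the restore-time SQL for a subscription in a regular
(non-binary-upgrade) dump looks roughly like this (a sketch, not
verbatim pg_dump output; the connection string is a placeholder):

```sql
-- Sketch: pg_dump recreates the subscription without connecting to the
-- publisher, so it comes back disabled and must be refreshed/enabled
-- manually after restore -- the behavior the docs above describe.
CREATE SUBSCRIPTION regress_sub
    CONNECTION 'host=publisher dbname=postgres'  -- placeholder connstr
    PUBLICATION regress_pub
    WITH (connect = false);
```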

--
With Regards,
Amit Kapila.

#157Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#153)
Re: pg_upgrade and logical replication

On Wed, Nov 29, 2023 at 3:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

In general, the test cases are a bit complex to understand, so, it
will be difficult to enhance these later. The complexity comes from
the fact that one upgrade test is trying to test multiple things (a)
Enabled/Disabled subscriptions; (b) relation states 'i' and 'r' are
preserved after the upgrade. (c) rows from non-refreshed tables are
not copied, etc. I understand that you may want to cover as many
things as possible in one test to have fewer upgrade tests, which could
save some time, but I think it makes the test somewhat difficult to
understand and enhance. Can we try to split it such that (a) and (b)
are tested in one test and the others are separated out?

Few other comments:
===================
1.
+$old_sub->safe_psql('postgres',
+ "CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION
regress_pub"
+);
+
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+# After the above wait_for_subscription_sync call the table can be either in
+# 'syncdone' or in 'ready' state. Now wait till the table reaches
'ready' state.
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";

Can the table be in 'i' state after the above test? If not, then the
above comment is misleading.
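
A quick way to see which state the table actually ends up in is to
query pg_subscription_rel directly on the old subscriber (a sketch;
state letters as in the quoted test comment):

```sql
-- Inspect per-table sync states for the subscriptions. After
-- wait_for_subscription_sync, a table is typically in 's' (syncdone)
-- or 'r' (ready); the quoted test then polls until it reaches 'r'.
SELECT srrelid::regclass AS rel, srsubstate
FROM pg_subscription_rel;
```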

2.
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready or in
+# init state.
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+ "INSERT INTO tab_upgraded1 VALUES (generate_series(2,50), 'before
initial sync')"
+);
+$publisher->wait_for_catchup('regress_sub');

The previous comment applies to this one as well.

3.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+ "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+ "max_logical_replication_workers = 0");
+$old_sub->restart;
+
+# Add tab_upgraded2 to the publication. Now publication has tab_upgraded1
+# and tab_upgraded2 tables.
+$publisher->safe_psql('postgres',
+ "ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+ "ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");

These two cases for Create and Alter look confusing. I think it would
be better if Alter's case is moved before the comment: "Check that
pg_upgrade is successful when all tables are in ready or in init
state.".

4.
+# Insert a row in tab_upgraded1 and tab_not_upgraded1 publisher table while
+# it's down.
+insert_line_at_pub('while old_sub is down');

Doesn't the subroutine insert_line_at_pub() insert into all three
tables? If so, then the above comment seems to be wrong, and I think it
is better to explain the intention of this insert.

5.
+my $result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub'");
+is($result, qq(f),
+ "check that the subscriber that was disable on the old subscriber
should be disabled in the new subscriber"
+);
+$result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub1'");
+is($result, qq(t),
+ "check that the subscriber that was enabled on the old subscriber
should be enabled in the new subscriber"
+);

Can't the above be tested with a single query?
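
For what it's worth, a single-query form might look like this (a
sketch, not from the patch; subscription names as in the quoted test):

```sql
-- Fetch both subscriptions' enabled flags in one query; the test could
-- then compare against the expected two-row result with a single is().
SELECT subname, subenabled
FROM pg_subscription
WHERE subname IN ('regress_sub', 'regress_sub1')
ORDER BY subname;
```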

6.
+$new_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
+
+# Subscription relations should be preserved. The upgraded subscriber
won't know
+# about 'tab_not_upgraded1' because the subscription is not yet refreshed.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2),
+ "there should be 2 rows in pg_subscription_rel(representing
tab_upgraded1 and tab_upgraded2)"
+);

Here the DROP SUBSCRIPTION looks confusing. Let's try to move it after
the verification of objects after the upgrade.

7.
1.
+sub insert_line_at_pub
+{
+ my $payload = shift;
+
+ foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+ {
+ $publisher->safe_psql('postgres',
+ "INSERT INTO " . $_ . " (val) VALUES('$payload')");
+ }
+}
+
+# Initial setup
+foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+{
+ $publisher->safe_psql('postgres',
+ "CREATE TABLE " . $_ . " (id serial, val text)");
+ $old_sub->safe_psql('postgres',
+ "CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line_at_pub('before initial sync');

This makes the test slightly difficult to understand, and we don't seem
to achieve much by using subroutines.

--
With Regards,
Amit Kapila.

#158vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#153)
1 attachment(s)
Re: pg_upgrade and logical replication

On Wed, 29 Nov 2023 at 15:02, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Nov 28, 2023 at 4:12 PM vignesh C <vignesh21@gmail.com> wrote:

Few comments on the latest patch:
===========================
1.
+ if (fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n");
+ else
+ appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n");
+
+ if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, " s.subenabled\n");
+ else
+ appendPQExpBufferStr(query, " false AS subenabled\n");
+
+ appendPQExpBufferStr(query,
+ "FROM pg_subscription s\n");
+
+ if (fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query,
+ "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+ "    ON o.external_id = 'pg_' || s.oid::text \n");

Why does 'subenabled' have a check for binary_upgrade while
'suboriginremotelsn' doesn't?

Combined these two now.

2.
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+ Relation rel;
+ HeapTuple tup;
+ Oid subid;
+ Form_pg_subscription form;
+ char    *subname;
+ Oid relid;
+ char relstate;
+ XLogRecPtr sublsn;
+
+ CHECK_IS_BINARY_UPGRADE;
+
+ /* We must check these things before dereferencing the arguments */
+ if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+ elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is
not allowed");
+
+ subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ relid = PG_GETARG_OID(1);
+ relstate = PG_GETARG_CHAR(2);
+ sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+ tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+ if (!HeapTupleIsValid(tup))
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relation %u does not exist", relid));
+ ReleaseSysCache(tup);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);

Why is there no locking for the relation? I see that during the subscription
operation, we do acquire AccessShareLock on the relation before adding
a corresponding entry in pg_subscription_rel. See the following code:

CreateSubscription()
{
...
foreach(lc, tables)
{
RangeVar *rv = (RangeVar *) lfirst(lc);
Oid relid;

relid = RangeVarGetRelid(rv, AccessShareLock, false);

/* Check for supported relkind. */
CheckSubscriptionRelkind(get_rel_relkind(relid),
rv->schemaname, rv->relname);

AddSubscriptionRelState(subid, relid, table_state,
InvalidXLogRecPtr);
...
}

Modified

3.
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
{
...
...
+ AddSubscriptionRelState(subid, relid, relstate, sublsn);
...
}

I see a problem with directly using this function: it doesn't release
its locks, which means it expects the caller either to release them or
to postpone releasing them until the end of the transaction. However,
all the other binary_upgrade support functions don't postpone releasing
locks till the transaction ends. I think we should add an additional
parameter to indicate whether we want to release locks, and then pass
it as true from the binary upgrade support function.

Modified

4.
extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
int numTables);
extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);

getSubscriptions() and getSubscriptionTables() are defined in the
opposite order in the .c file. I think it is better to change the order
in the .c file unless there is a reason for not doing so.

Modified

5. At this stage, no need to update/send the 0002 patch, we can look
at it after the main patch is committed. That is anyway not directly
related to the main patch.

Removed it from this version.

Apart from the above, I have modified a few comments and messages in
the attached. Kindly review and include the changes if you are fine
with those.

Merged them.

The attached v21 version patch has the change for the same.

Regards,
Vignesh

Attachments:

v21-0001-Preserve-the-full-subscription-s-state-during-pg.patchtext/x-patch; charset=US-ASCII; name=v21-0001-Preserve-the-full-subscription-s-state-during-pg.patchDownload
From 5fadbe8c9a54855026b1dd870f5d74ef4da20f38 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v21] Preserve the full subscription's state during pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_add_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_add_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, the state is the state of the relation, sublsn is subscription lsn
which is optional, and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values (subname, relid, state and sublsn) from the
old cluster.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To fix this problem, this patch teaches pg_dump to update the replication
origin along with create subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is subscription lsn.
pg_dump will retrieve these values (subname and sublsn) from the old cluster.

pg_upgrade will check that all the subscription relations are in 'i' (init) or
in 'r' (ready) state, and will error out if that's not the case, logging the
reason for the failure.

Author: Vignesh C, Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  50 ++++
 src/backend/catalog/pg_subscription.c      |   9 +-
 src/backend/commands/subscriptioncmds.c    |   4 +-
 src/backend/utils/adt/pg_upgrade_support.c | 129 +++++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 231 ++++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  17 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 192 ++++++++++++-
 src/bin/pg_upgrade/info.c                  |  56 +++-
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   2 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 309 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/include/catalog/pg_subscription_rel.h  |   2 +-
 src/tools/pgindent/typedefs.list           |   1 +
 16 files changed, 1030 insertions(+), 16 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 4f78e0e1c0..8c14047aa5 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,56 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Setup the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies which includes the subscription table information present in
+     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and also the subscription replication origin. This allows
+     logical replication on the new subscriber to continue from where the
+     old subscriber was up to. Migration of subscription dependencies is only
+     supported when the old cluster is version 17.0 or later. Subscription
+     dependencies on clusters before version 17.0 will silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize) or <literal>r</literal> (ready). This
+       can be verified by checking <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d6a978f136..84587c4ecc 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -231,7 +231,7 @@ textarray_to_stringlist(ArrayType *textarray)
  */
 void
 AddSubscriptionRelState(Oid subid, Oid relid, char state,
-						XLogRecPtr sublsn)
+						XLogRecPtr sublsn, bool upgrade)
 {
 	Relation	rel;
 	HeapTuple	tup;
@@ -268,8 +268,11 @@ AddSubscriptionRelState(Oid subid, Oid relid, char state,
 
 	heap_freetuple(tup);
 
-	/* Cleanup. */
-	table_close(rel, NoLock);
+	/*
+	 * Cleanup. In case of binary-upgrade mode release the RowExclusiveLock
+	 * lock taken.
+	 */
+	table_close(rel, upgrade ? RowExclusiveLock : NoLock);
 }
 
 /*
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index edc82c11be..2f905fad41 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -773,7 +773,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 										 rv->schemaname, rv->relname);
 
 				AddSubscriptionRelState(subid, relid, table_state,
-										InvalidXLogRecPtr);
+										InvalidXLogRecPtr, false);
 			}
 
 			/*
@@ -943,7 +943,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 			{
 				AddSubscriptionRelState(sub->oid, relid,
 										copy_data ? SUBREL_STATE_INIT : SUBREL_STATE_READY,
-										InvalidXLogRecPtr);
+										InvalidXLogRecPtr, false);
 				ereport(DEBUG1,
 						(errmsg_internal("table \"%s.%s\" added to subscription \"%s\"",
 										 rv->schemaname, rv->relname, sub->name)));
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..9596a04f9e 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,23 @@
 
 #include "postgres.h"
 
+#include "access/relation.h"
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +313,124 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	subrel;
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	subrel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	rel = relation_open(relid, AccessShareLock);
+	AddSubscriptionRelState(subid, relid, relstate, sublsn, true);
+	relation_close(rel, AccessShareLock);
+
+	ReleaseSysCache(tup);
+	table_close(subrel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	remote_commit;
+	char		originname[NAMEDATALEN];
+	RepOriginId node;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	remote_commit = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* Lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	node = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster. Shutdown server will flush the origins during shutdown
+	 * checkpoint.
+	 */
+	replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 8c0b5486b9..0a00834aae 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -297,6 +297,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4618,6 +4619,8 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
+	int			i_subenabled;
 	int			i,
 				ntups;
 
@@ -4673,16 +4676,30 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n"
+									" s.subenabled\n");
+	else
+		appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n"
+									" false AS subenabled\n");
+
+	appendPQExpBufferStr(query,
+						 "FROM pg_subscription s\n");
+
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query,
+							 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+							 "    ON o.external_id = 'pg_' || s.oid::text \n");
+
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4709,6 +4726,8 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
+	i_subenabled = PQfnumber(res, "subenabled");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4746,6 +4765,13 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+		subinfo[i].subenabled =
+			pg_strdup(PQgetvalue(res, i, i_subenabled));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4755,6 +4781,165 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to pg_subscription_rel table. This will be used only in
+		 * binary-upgrade mode.
+		 */
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+		appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+		appendPQExpBuffer(query,
+						  ", %u, '%c'",
+						  subrinfo->tblinfo->dobj.catId.oid,
+						  subrinfo->srsubstate);
+
+		if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+			appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn);
+		else
+			appendPQExpBuffer(query, ", NULL");
+
+		appendPQExpBufferStr(query, ");\n");
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4835,6 +5020,42 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	/*
+	 * In binary-upgrade mode, we allow the replication to continue after the
+	 * upgrade.
+	 */
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+	{
+		if (subinfo->suboriginremotelsn)
+		{
+			/*
+			 * Preserve the remote_lsn for the subscriber's replication
+			 * origin. This value is required to start the replication from the
+			 * position before the upgrade. This value will be stale if the
+			 * publisher gets upgraded before the subscriber node. However,
+			 * this shouldn't be a problem as the upgrade ensures that all the
+			 * transactions were replicated before upgrading the publisher.
+			 */
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+			appendStringLiteralAH(query, subinfo->dobj.name, fout);
+			appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+		}
+
+		if (strcmp(subinfo->subenabled, "t") == 0)
+		{
+			/*
+			 * Enable the subscription to allow the replication to continue
+			 * after the upgrade.
+			 */
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber's running state.\n");
+			appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s ENABLE;\n", qsubname);
+		}
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10453,6 +10674,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18519,6 +18743,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..7ce34288ea 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -660,6 +661,7 @@ typedef struct _SubscriptionInfo
 {
 	DumpableObject dobj;
 	const char *rolname;
+	char	   *subenabled;
 	char	   *subbinary;
 	char	   *substream;
 	char	   *subtwophasestate;
@@ -671,8 +673,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +712,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +772,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..6bb6432f17 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -34,7 +34,9 @@ static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 static void check_new_cluster_logical_replication_slots(void);
+static void check_new_cluster_subscription_configuration(void);
 static void check_old_cluster_for_valid_slots(bool live_check);
+static void check_old_cluster_subscription_state(void);
 
 
 /*
@@ -112,13 +114,21 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/*
-	 * Logical replication slots can be migrated since PG17. See comments atop
-	 * get_old_cluster_logical_slot_infos().
-	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+	{
+		/*
+		 * Logical replication slots can be migrated since PG17. See comments
+		 * atop get_old_cluster_logical_slot_infos().
+		 */
 		check_old_cluster_for_valid_slots(live_check);
 
+		/*
+		 * Subscriptions and their dependencies can be migrated since PG17. See
+		 * comments atop get_db_subscription_count().
+		 */
+		check_old_cluster_subscription_state();
+	}
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -237,6 +247,8 @@ check_new_cluster(void)
 	check_for_new_tablespace_dir();
 
 	check_new_cluster_logical_replication_slots();
+
+	check_new_cluster_subscription_configuration();
 }
 
 
@@ -1538,6 +1550,53 @@ check_new_cluster_logical_replication_slots(void)
 	check_ok();
 }
 
+/*
+ * check_new_cluster_subscription_configuration()
+ *
+ * Verify that the max_replication_slots configuration specified is enough for
+ * creating the subscriptions. This is required to create the replication
+ * origin for each subscription.
+ */
+static void
+check_new_cluster_subscription_configuration(void)
+{
+	PGresult   *res;
+	PGconn	   *conn;
+	int			nsubs_on_old;
+	int			max_replication_slots;
+
+	/* Subscriptions and their dependencies can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	nsubs_on_old = count_old_cluster_subscriptions();
+
+	/* Quick return if there are no subscriptions to be migrated. */
+	if (nsubs_on_old == 0)
+		return;
+
+	prep_status("Checking for new cluster configuration for subscriptions");
+
+	conn = connectToServer(&new_cluster, "template1");
+
+	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+							"WHERE name = 'max_replication_slots';");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine parameter settings on new cluster");
+
+	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
+	if (nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
+	PQclear(res);
+	PQfinish(conn);
+
+	check_ok();
+}
+
 /*
  * check_old_cluster_for_valid_slots()
  *
@@ -1613,3 +1672,128 @@ check_old_cluster_for_valid_slots(bool live_check)
 
 	check_ok();
 }
+
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that the relations of each subscription are all in either 'i'
+ * (initialize) or 'r' (ready) state.
+ */
+static void
+check_old_cluster_subscription_state(void)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subs_invalid.txt");
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &old_cluster.dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(&old_cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "replication origin is missing for database:\"%s\" subscription:\"%s\"\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * We don't allow upgrade if there is a risk of dangling slot or origin
+		 * corresponding to initial sync after upgrade.
+		 *
+		 * A slot/origin not created yet refers to the 'i' (initialize) state,
+		 * while 'r' (ready) state refers to a slot/origin created previously but
+		 * already dropped. These states are supported for pg_upgrade. The other
+		 * states listed below are not supported:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * used for the slot name won't match anymore.
+		 *
+		 * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+		 * would retain the replication origin when there is a failure in
+		 * tablesync worker immediately after dropping the replication slot in
+		 * the publisher.
+		 *
+		 * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT s.subname, n.nspname, c.relname, r.srsubstate "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\" state:\"%s\" not in required state\n",
+					active_db->db_name,
+					PQgetvalue(res, i, 0),
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without origin or having relations not in i (initialize) or r (ready) state.\n"
+				 "You can allow the initial sync to finish for all relations and then restart the upgrade.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
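For a manual pre-flight check, the two validations above boil down to catalog queries that can also be run by hand on the old cluster before attempting the upgrade (a sketch using the same queries as the patch; both should return zero rows for the upgrade checks to pass):

```sql
-- Subscriptions whose replication origin is missing (run once per cluster).
SELECT d.datname, s.subname
FROM pg_catalog.pg_subscription s
LEFT OUTER JOIN pg_catalog.pg_replication_origin o
       ON o.roname = 'pg_' || s.oid
INNER JOIN pg_catalog.pg_database d
       ON d.oid = s.subdbid
WHERE o.roname IS NULL;

-- Relations not in 'i' (init) or 'r' (ready) state (run in each database).
SELECT s.subname, n.nspname, c.relname, r.srsubstate
FROM pg_catalog.pg_subscription_rel r
LEFT JOIN pg_catalog.pg_subscription s ON r.srsubid = s.oid
LEFT JOIN pg_catalog.pg_class c ON r.srrelid = c.oid
LEFT JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid
WHERE r.srsubstate NOT IN ('i', 'r')
ORDER BY s.subname;
```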
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..cc73c0fc0c 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,6 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
+static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,10 +294,14 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		get_rel_infos(cluster, pDbInfo);
 
 		/*
-		 * Retrieve the logical replication slots infos for the old cluster.
+		 * Retrieve the logical replication slots infos and the subscriptions
+		 * count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
+		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
+			get_db_subscription_count(pDbInfo);
+		}
 	}
 
 	if (cluster == &old_cluster)
@@ -730,6 +735,55 @@ count_old_cluster_logical_slots(void)
 	return slot_count;
 }
 
+/*
+ * get_db_subscription_count()
+ *
+ * Gets the number of subscriptions of the database referred to by "dbinfo".
+ *
+ * Note: This function does nothing if the old cluster is pre-PG17, because
+ * logical slots cannot be migrated from such versions, so a logical
+ * replication setup could not be upgraded completely anyway.
+ */
+static void
+get_db_subscription_count(DbInfo *dbinfo)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	res = executeQueryOrDie(conn, "SELECT count(*) "
+							"FROM pg_catalog.pg_subscription WHERE subdbid = %d",
+							dbinfo->db_oid);
+	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscriptions for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 and prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_db_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+	int			nsubs = 0;
+
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+	return nsubs;
+}
+
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..d63f13fffc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -195,6 +195,7 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
+	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -421,6 +422,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
+int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..dc924030bb
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,309 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+my $oldbindir = $old_sub->config_data('--bindir');
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $newbindir = $new_sub->config_data('--bindir');
+
+# In a VPATH build, we'll be started in the source directory, but we want
+# to run pg_upgrade in the build directory so that any files generated finish
+# in it, like delete_old_cluster.{sh,bat}.
+chdir ${PostgreSQL::Test::Utils::tmp_check};
+
+# Initial setup
+$publisher->safe_psql('postgres', "CREATE TABLE tab_upgraded1(id int)");
+$publisher->safe_psql('postgres', "CREATE TABLE tab_upgraded2(id int)");
+$old_sub->safe_psql('postgres', "CREATE TABLE tab_upgraded1(id int)");
+$old_sub->safe_psql('postgres', "CREATE TABLE tab_upgraded2(id int)");
+
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+# Create a subscription in enabled state before upgrade
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+# Pre-setup for preparing subscription table in ready state
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub2 FOR TABLE tab_upgraded1");
+
+# Pre-setup for preparing a subscription in disabled state before upgrade
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2"
+);
+
+# Wait till the table tab_upgraded1 reaches 'ready' state
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";
+
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))"
+);
+$publisher->wait_for_catchup('regress_sub2');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub->restart;
+
+# Pre-setup for preparing subscription table in init state. Add tab_upgraded2
+# to the publication.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+
+# The table tab_upgraded2 will be in init state as the subscriber configuration
+# for max_logical_replication_workers is set to 0.
+my $result = $old_sub->safe_psql('postgres',
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'"
+);
+is($result, qq(t), "Check that the table is in init state");
+
+# Get the replication origin remote_lsn of the old subscriber
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub2'"
+);
+
+# Have the subscription in disabled state before upgrade
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready or in
+# init state (tab_upgraded1 table is in ready state and tab_upgraded2 table is
+# in init state).
+# ------------------------------------------------------
+command_ok(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode
+	],
+	'run of pg_upgrade for old instance when the subscription tables are in init/ready state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# ------------------------------------------------------
+# Check that the data inserted to the publisher when the subscriber is down will
+# be replicated to the new subscriber once the new subscriber is started.
+# ------------------------------------------------------
+$publisher->safe_psql('postgres', "INSERT INTO tab_upgraded1 VALUES(51)");
+$publisher->safe_psql('postgres', "INSERT INTO tab_upgraded2 VALUES(1)");
+
+$new_sub->start;
+
+# The subscription's running status should be preserved
+$result =
+  $new_sub->safe_psql('postgres',
+	"SELECT subenabled FROM pg_subscription ORDER BY subname");
+is($result, qq(t
+f),
+	"check that the subscription's running status are preserved"
+);
+
+my $sub_oid = $new_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+
+# Subscription relations should be preserved
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel WHERE srsubid = $sub_oid");
+is($result, qq(2),
+	"there should be 2 rows in pg_subscription_rel (representing tab_upgraded1 and tab_upgraded2)"
+);
+
+# The replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status WHERE external_id = 'pg_' || $sub_oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
+$publisher->wait_for_catchup('regress_sub2');
+
+# Rows on tab_upgraded1 and tab_upgraded2 should have been replicated
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(1),
+	"check the data is synced after enabling the subscription for the table that was in init state"
+);
+
+# cleanup
+$new_sub->stop;
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 4");
+$old_sub->start;
+$old_sub->safe_psql(
+	'postgres', qq[
+		ALTER SUBSCRIPTION regress_sub1 DISABLE;
+		ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none);
+		DROP SUBSCRIPTION regress_sub1;
+]);
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than number of subscriptions in the old cluster.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+$old_sub->stop;
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+$old_sub->start;
+
+# Drop the subscription
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there is a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init), and/or
+# b) a subscription is missing its replication origin.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE PUBLICATION regress_pub3 FOR TABLE tab_primary_key;
+]);
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub3;
+]);
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of a primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (enabled=false)"
+);
+
+my $subid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
+my $reporigin = 'pg_' . qq($subid);
+
+# Drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' (datasync, an invalid state) and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub1->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/database:\"postgres\" subscription:\"regress_sub3\" schema:\"public\" relation:\"tab_primary_key\" state:\"d\" not in required state/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub4 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/replication origin is missing for database:\"postgres\" subscription:\"regress_sub4\"/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fb58dee3bc..45c681db5e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11396,6 +11396,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h
index f5324b710d..62bdba5479 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -81,7 +81,7 @@ typedef struct SubscriptionRelState
 } SubscriptionRelState;
 
 extern void AddSubscriptionRelState(Oid subid, Oid relid, char state,
-									XLogRecPtr sublsn);
+									XLogRecPtr sublsn, bool upgrade);
 extern void UpdateSubscriptionRelState(Oid subid, Oid relid, char state,
 									   XLogRecPtr sublsn);
 extern char GetSubscriptionRelState(Oid subid, Oid relid, XLogRecPtr *sublsn);
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index d659adbfd6..0168f10348 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2665,6 +2665,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

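Putting the pg_dump pieces together, the net effect of the changes above is that in binary-upgrade mode the restore script for an enabled subscription with one table in 'r' state contains roughly the following (the subscription/publication names, table OID, and LSN here are illustrative, not taken from a real run):

```sql
-- Subscription is created without connecting to the publisher.
CREATE SUBSCRIPTION regress_sub2 CONNECTION 'dbname=postgres'
    PUBLICATION regress_pub2 WITH (connect = false);

-- For binary upgrade, must preserve the subscriber table.
SELECT pg_catalog.binary_upgrade_add_sub_rel_state('regress_sub2', 16418, 'r', '0/15D8AB0');

-- For binary upgrade, must preserve the remote_lsn for the subscriber's
-- replication origin.
SELECT pg_catalog.binary_upgrade_replorigin_advance('regress_sub2', '0/15D8AB0');

-- For binary upgrade, must preserve the subscriber's running state.
ALTER SUBSCRIPTION regress_sub2 ENABLE;
```

Since the origin has already been advanced to the pre-upgrade remote_lsn, enabling the subscription resumes apply from that position rather than re-copying data.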
#159vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#154)
Re: pg_upgrade and logical replication

On Thu, 30 Nov 2023 at 06:37, Peter Smith <smithpb2250@gmail.com> wrote:

Here are some review comments for patch v20-0001

======

1. getSubscriptions

+ if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, " s.subenabled\n");
+ else
+ appendPQExpBufferStr(query, " false AS subenabled\n");

Probably I misunderstood this logic... AFAIK the CREATE SUBSCRIPTION
is normally default *enabled*, so why does this code set default
differently as 'false'. OTOH, if this is some special case default
needed because the subscription upgrade is not supported before PG17
then maybe it needs a comment to explain.

No changes needed to be done in this case, explanation for the same is
given at [1] (message CAA4eK1JpWkRBFMDC3wOCK=HzCXg8XT1jH-tWb=b++_8YS2=QSQ@mail.gmail.com).

~~~

2. dumpSubscription

+ if (strcmp(subinfo->subenabled, "t") == 0)
+ {
+ appendPQExpBufferStr(query,
+ "\n-- For binary upgrade, must preserve the subscriber's running state.\n");
+ appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s ENABLE;\n", qsubname);
+ }

(this is a bit similar to previous comment)

Probably I misunderstood this logic... but AFAIK the CREATE
SUBSCRIPTION is normally default *enabled*. In the CREATE SUBSCRIPTION
top of this function I did not see any "enabled=xxx" code, so won't
this just default to enabled=true per normal. In other words, what
happens if the subscription being upgraded was already DISABLED -- How
does it remain disabled still after upgrade?

But I saw there is a test case for this so perhaps the code is fine?
Maybe it just needs more explanatory comments for this area?

No changes are needed in this case; an explanation is given at [1].

======
src/bin/pg_upgrade/t/004_subscription.pl

3.
+# The subscription's running status should be preserved
+my $result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub'");
+is($result, qq(f),
+ "check that the subscriber that was disable on the old subscriber
should be disabled in the new subscriber"
+);
+$result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub1'");
+is($result, qq(t),
+ "check that the subscriber that was enabled on the old subscriber
should be enabled in the new subscriber"
+);
+$new_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
+

BEFORE
check that the subscriber that was disable on the old subscriber
should be disabled in the new subscriber

SUGGESTION
check that a subscriber that was disabled on the old subscriber is
disabled on the new subscriber
~

BEFORE
check that the subscriber that was enabled on the old subscriber
should be enabled in the new subscriber

SUGGESTION
check that a subscriber that was enabled on the old subscriber is
enabled on the new subscriber

These statements are combined now

~~~

4.
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+
+# Check the number of rows for each table on each server

Double blank lines.

Modified

~~~

5.
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub1 DISABLE");
+$old_sub->safe_psql('postgres',
+ "ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none)");
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
+

Probably it would be tidier to combine all of those.

Modified

The changes for the same are present in the v21 version patch attached at [2].

[1]: /messages/by-id/CAA4eK1JpWkRBFMDC3wOCK=HzCXg8XT1jH-tWb=b++_8YS2=QSQ@mail.gmail.com
[2]: /messages/by-id/CALDaNm37E4tmSZd+k1ixtKevX3eucmhdOnw4pGmykZk4C1Nm4Q@mail.gmail.com

Regards,
Vignesh

#160vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#157)
Re: pg_upgrade and logical replication

On Thu, 30 Nov 2023 at 13:35, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 29, 2023 at 3:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

In general, the test cases are a bit complex to understand, so, it
will be difficult to enhance these later. The complexity comes from
the fact that one upgrade test is trying to test multiple things (a)
Enabled/Disabled subscriptions; (b) relation states 'i' and 'r' are
preserved after the upgrade. (c) rows from non-refreshed tables are
not copied, etc. I understand that you may want to cover as many
things possible in one test to have fewer upgrade tests which could
save some time but I think it makes the test somewhat difficult to
understand and enhance. Can we try to split it such that (a) and (b)
are tested in one test and others could be separated out?

Yes, I had tried to combine a few tests as they were taking more time to
run. I have refactored the tests by removing the tab_not_upgraded1 related
test, which is more of a logical replication test, adding more comments,
and removing the intermediate select count checks. So now we have:
test1) Check upgrade with the subscriber having a table in init/ready
state. test2) Check that the data inserted into the publisher while the
subscriber is down will be replicated to the new subscriber once the new
subscriber is started (done as a continuation of the previous test).
test3) Check that pg_upgrade fails when max_replication_slots configured
in the new cluster is less than the number of subscriptions in the old
cluster. test4) Check that upgrade fails for an old instance with a
relation in 'd' datasync (invalid) state or with a missing replication
origin.
In test4 I have combined both the datasync relation state and the missing
replication origin, as the validation for both is in the same file. I
feel the readability is better now; do let me know if any of the tests
are still difficult to understand.

Few other comments:
===================
1.
+$old_sub->safe_psql('postgres',
+ "CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION
regress_pub"
+);
+
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+# After the above wait_for_subscription_sync call the table can be either in
+# 'syncdone' or in 'ready' state. Now wait till the table reaches
'ready' state.
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";

Can the table be in 'i' state after above test? If not, then above
comment is misleading.

This part of the test is to get the table into ready state. Modified the
comments appropriately.

2.
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready or in
+# init state.
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+ "INSERT INTO tab_upgraded1 VALUES (generate_series(2,50), 'before
initial sync')"
+);
+$publisher->wait_for_catchup('regress_sub');

The previous comment applies to this one as well.

I have removed this comment and moved it before the upgrade command as
it is more appropriate there.

3.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+ "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+ "max_logical_replication_workers = 0");
+$old_sub->restart;
+
+# Add tab_upgraded2 to the publication. Now publication has tab_upgraded1
+# and tab_upgraded2 tables.
+$publisher->safe_psql('postgres',
+ "ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+ "ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");

These two cases for Create and Alter look confusing. I think it would
be better if Alter's case is moved before the comment: "Check that
pg_upgrade is successful when all tables are in ready or in init
state.".

I have added more comments to make it clear now. I have moved the
"check that pg_upgrade is successful when all tables ..." comment to
before the upgrade command to be clearer. Added the comments "Pre-setup
for preparing subscription table in init state. Add tab_upgraded2 to the
publication." and "# The table tab_upgraded2 will be in init state as
the subscriber configuration for max_logical_replication_workers is
set to 0."

4.
+# Insert a row in tab_upgraded1 and tab_not_upgraded1 publisher table while
+# it's down.
+insert_line_at_pub('while old_sub is down');

Doesn't the sub routine insert_line_at_pub() insert into all three tables? If
so, then the above comment seems to be wrong and I think it is better
to explain the intention of this insert.

Modified

5.
+my $result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub'");
+is($result, qq(f),
+ "check that the subscriber that was disable on the old subscriber
should be disabled in the new subscriber"
+);
+$result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub1'");
+is($result, qq(t),
+ "check that the subscriber that was enabled on the old subscriber
should be enabled in the new subscriber"
+);

Can't the above be tested with a single query?

Modified
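For illustration, a combined check could look something like the
following (a sketch only; the subscription names are from the test and
the actual query used in the patch may differ):

```sql
-- Hypothetical combined form of the two subenabled checks, returning
-- both subscriptions' enabled flags in a single deterministic result:
SELECT subname, subenabled
FROM pg_subscription
WHERE subname IN ('regress_sub', 'regress_sub1')
ORDER BY subname;
```

A single ordered query lets the test compare one multi-line expected
value instead of issuing two separate round trips.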

6.
+$new_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
+
+# Subscription relations should be preserved. The upgraded subscriber
won't know
+# about 'tab_not_upgraded1' because the subscription is not yet refreshed.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2),
+ "there should be 2 rows in pg_subscription_rel(representing
tab_upgraded1 and tab_upgraded2)"
+);

Here the DROP SUBSCRIPTION looks confusing. Let's try to move it after
the verification of objects after the upgrade.

I have removed this now; there is no need to move it down, as we will be
stopping the newsub server at the end of this test and this newsub
will not be used later.

7.
+sub insert_line_at_pub
+{
+ my $payload = shift;
+
+ foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+ {
+ $publisher->safe_psql('postgres',
+ "INSERT INTO " . $_ . " (val) VALUES('$payload')");
+ }
+}
+
+# Initial setup
+foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+{
+ $publisher->safe_psql('postgres',
+ "CREATE TABLE " . $_ . " (id serial, val text)");
+ $old_sub->safe_psql('postgres',
+ "CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line_at_pub('before initial sync');

This makes the test slightly difficult to understand and we don't seem
to achieve much by writing sub routines.

Removed the subroutines.

The changes for the same are available at:
/messages/by-id/CALDaNm37E4tmSZd+k1ixtKevX3eucmhdOnw4pGmykZk4C1Nm4Q@mail.gmail.com

Regards,
Vignesh

#161Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#158)
Re: pg_upgrade and logical replication

Here are review comments for patch v21-0001

======
src/bin/pg_upgrade/check.c

1. check_old_cluster_subscription_state

+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that each of the subscriptions has all their corresponding tables in
+ * i (initialize) or r (ready).
+ */
+static void
+check_old_cluster_subscription_state(void)

Function comment should also mention it also validates the origin.

~~~

2.
In this function there are a couple of errors written to the
"subs_invalid.txt" file:

+ fprintf(script, "replication origin is missing for database:\"%s\"
subscription:\"%s\"\n",
+ PQgetvalue(res, i, 0),
+ PQgetvalue(res, i, 1));

and

+ fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\"
relation:\"%s\" state:\"%s\" not in required state\n",
+ active_db->db_name,
+ PQgetvalue(res, i, 0),
+ PQgetvalue(res, i, 1),
+ PQgetvalue(res, i, 2),
+ PQgetvalue(res, i, 3));

The format of those messages is not consistent. It could be improved
in a number of ways to make them more similar. e.g. below.

SUGGESTION #1
the replication origin is missing for database:\"%s\" subscription:\"%s\"\n
the table sync state \"%s\" is not allowed for database:\"%s\"
subscription:\"%s\" schema:\"%s\" relation:\"%s\"\n

SUGGESTION #2
database:\"%s\" subscription:\"%s\" -- replication origin is missing\n
database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\" --
upgrade when table sync state is \"%s\" is not supported\n

etc.

======
src/bin/pg_upgrade/t/004_subscription.pl

3.
+# Initial setup
+$publisher->safe_psql('postgres', "CREATE TABLE tab_upgraded1(id int)");
+$publisher->safe_psql('postgres', "CREATE TABLE tab_upgraded2(id int)");
+$old_sub->safe_psql('postgres', "CREATE TABLE tab_upgraded1(id int)");
+$old_sub->safe_psql('postgres', "CREATE TABLE tab_upgraded2(id int)");

IMO it is tidier to combine multiple DDLS whenever you can.

~~~

4.
+# Create a subscription in enabled state before upgrade
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+ "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');

That publication has an empty set of tables. Should there be some
comment to explain why it is OK like this?

~~~

5.
+# Wait till the table tab_upgraded1 reaches 'ready' state
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";
+
+$publisher->safe_psql('postgres',
+ "INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))"
+);
+$publisher->wait_for_catchup('regress_sub2');

IMO better without the blank line, so then everything more clearly
belongs to this same comment.

~~~

6.
+# Pre-setup for preparing subscription table in init state. Add tab_upgraded2
+# to the publication.
+$publisher->safe_psql('postgres',
+ "ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+ "ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");

Ditto. IMO better without the blank line, so then everything more
clearly belongs to this same comment.

~~~

7.
+command_ok(
+ [
+ 'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+ '-D', $new_sub->data_dir, '-b', $oldbindir,
+ '-B', $newbindir, '-s', $new_sub->host,
+ '-p', $old_sub->port, '-P', $new_sub->port,
+ $mode
+ ],
+ 'run of pg_upgrade for old instance when the subscription tables are
in init/ready state'
+);

Maybe those 'command_ok' args can be formatted neatly (like you've
done later for the 'command_checks_all').

~~~

8.
+# ------------------------------------------------------
+# Check that the data inserted to the publisher when the subscriber
is down will
+# be replicated to the new subscriber once the new subscriber is started.
+# ------------------------------------------------------

8a.
SUGGESTION
...when the new subscriber is down will be replicated once it is started.

~

8b.
I thought this main comment should also say something like "Also check
that the old subscription states and relations origins are all
preserved."

~~~

9.
+$publisher->safe_psql('postgres', "INSERT INTO tab_upgraded1 VALUES(51)");
+$publisher->safe_psql('postgres', "INSERT INTO tab_upgraded2 VALUES(1)");

IMO it is tidier to combine multiple DDLS whenever you can.

~~~

10.
+# The subscription's running status should be preserved
+$result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription ORDER BY subname");
+is($result, qq(t
+f),
+ "check that the subscription's running status are preserved"
+);

I felt this was a bit too tricky. It might be more readable to do 2
separate SELECTs with explicit subnames. Alternatively, leave the code
as-is but improve the comment to explicitly say something like:

# old subscription regress_sub was enabled
# old subscription regress_sub1 was disabled

~~~

11.
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than number of subscriptions in the old cluster.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+$old_sub->stop;

/than number/than the number/

Should that old_sub->stop have been part of the previous cleanup steps?

~~~

12.
+$old_sub->start;
+
+# Drop the subscription
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");

Maybe it is tidier putting that 'start' below the comment.

~~~

13.
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run in:
+# a) if there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init) and/or
+# b) if the subscription has no replication origin.
+# ------------------------------------------------------

13a.
/refuses to run in:/refuses to run if:/

~

13b.
/a) if/a)/

~

13c.
/b) if/b)/

~~~

14.
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+ "CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION
regress_pub3 WITH (enabled=false)"
+);
+
+my $subid = $old_sub->safe_psql('postgres',
+ "SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
+my $reporigin = 'pg_' . qq($subid);
+
+# Drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+ "SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;

14a.
IMO better to have all this without blank lines, because it all
belongs to the first comment.

~

14b.
That 2nd comment "# Drop the..." is not required because the first
comment already says the same.

======
src/include/catalog/pg_subscription_rel.h

15.
 extern void AddSubscriptionRelState(Oid subid, Oid relid, char state,
- XLogRecPtr sublsn);
+ XLogRecPtr sublsn, bool upgrade);

Shouldn't this 'upgrade' really be 'binary_upgrade' so it better
matches the comment you added in that function?

If you agree, then change it here and also in the function definition.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#162Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Smith (#161)
Re: pg_upgrade and logical replication

On Fri, Dec 1, 2023 at 10:57 AM Peter Smith <smithpb2250@gmail.com> wrote:

Here are review comments for patch v21-0001

2.
In this function there are a couple of errors written to the
"subs_invalid.txt" file:

+ fprintf(script, "replication origin is missing for database:\"%s\"
subscription:\"%s\"\n",
+ PQgetvalue(res, i, 0),
+ PQgetvalue(res, i, 1));

and

+ fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\"
relation:\"%s\" state:\"%s\" not in required state\n",
+ active_db->db_name,
+ PQgetvalue(res, i, 0),
+ PQgetvalue(res, i, 1),
+ PQgetvalue(res, i, 2),
+ PQgetvalue(res, i, 3));

The format of those messages is not consistent. It could be improved
in a number of ways to make them more similar. e.g. below.

SUGGESTION #1
the replication origin is missing for database:\"%s\" subscription:\"%s\"\n
the table sync state \"%s\" is not allowed for database:\"%s\"
subscription:\"%s\" schema:\"%s\" relation:\"%s\"\n

+1. Shall we keep 'the' as 'The' in the message? A few other messages in
the same file start with a capital letter.

4.
+# Create a subscription in enabled state before upgrade
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+ "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');

That publication has an empty set of tables. Should there be some
comment to explain why it is OK like this?

I think we can add a comment to state the intention of the overall test
which this is part of.

10.
+# The subscription's running status should be preserved
+$result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription ORDER BY subname");
+is($result, qq(t
+f),
+ "check that the subscription's running status are preserved"
+);

I felt this was a bit too tricky. It might be more readable to do 2
separate SELECTs with explicit subnames. Alternatively, leave the code
as-is but improve the comment to explicitly say something like:

# old subscription regress_sub was enabled
# old subscription regress_sub1 was disabled

I don't see the need to have separate queries though adding comments
is a good idea.

15.
extern void AddSubscriptionRelState(Oid subid, Oid relid, char state,
- XLogRecPtr sublsn);
+ XLogRecPtr sublsn, bool upgrade);

Shouldn't this 'upgrade' really be 'binary_upgrade' so it better
matches the comment you added in that function?

It is better to name this parameter as retain_lock and then explain it
in the function header. The bigger problem with this change is that we
should release the other lock
(LockSharedObject(SubscriptionRelationId, subid, 0, AccessShareLock);)
taken in the function as well.

--
With Regards,
Amit Kapila.

#163vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#161)
1 attachment(s)
Re: pg_upgrade and logical replication

On Fri, 1 Dec 2023 at 10:57, Peter Smith <smithpb2250@gmail.com> wrote:

Here are review comments for patch v21-0001

======
src/bin/pg_upgrade/check.c

1. check_old_cluster_subscription_state

+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that each of the subscriptions has all their corresponding tables in
+ * i (initialize) or r (ready).
+ */
+static void
+check_old_cluster_subscription_state(void)

Function comment should also mention it also validates the origin.

Modified

~~~

2.
In this function there are a couple of errors written to the
"subs_invalid.txt" file:

+ fprintf(script, "replication origin is missing for database:\"%s\"
subscription:\"%s\"\n",
+ PQgetvalue(res, i, 0),
+ PQgetvalue(res, i, 1));

and

+ fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\"
relation:\"%s\" state:\"%s\" not in required state\n",
+ active_db->db_name,
+ PQgetvalue(res, i, 0),
+ PQgetvalue(res, i, 1),
+ PQgetvalue(res, i, 2),
+ PQgetvalue(res, i, 3));

The format of those messages is not consistent. It could be improved
in a number of ways to make them more similar. e.g. below.

SUGGESTION #1
the replication origin is missing for database:\"%s\" subscription:\"%s\"\n
the table sync state \"%s\" is not allowed for database:\"%s\"
subscription:\"%s\" schema:\"%s\" relation:\"%s\"\n

SUGGESTION #2
database:\"%s\" subscription:\"%s\" -- replication origin is missing\n
database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\" --
upgrade when table sync state is \"%s\" is not supported\n

etc.

Modified based on SUGGESTION#1

======
src/bin/pg_upgrade/t/004_subscription.pl

3.
+# Initial setup
+$publisher->safe_psql('postgres', "CREATE TABLE tab_upgraded1(id int)");
+$publisher->safe_psql('postgres', "CREATE TABLE tab_upgraded2(id int)");
+$old_sub->safe_psql('postgres', "CREATE TABLE tab_upgraded1(id int)");
+$old_sub->safe_psql('postgres', "CREATE TABLE tab_upgraded2(id int)");

IMO it is tidier to combine multiple DDLS whenever you can.

Modified

~~~

4.
+# Create a subscription in enabled state before upgrade
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+ "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');

That publication has an empty set of tables. Should there be some
comment to explain why it is OK like this?

This test is just to verify that the enabled subscriptions will be
enabled after the upgrade; we don't need data for this. Data validation
happens with a different subscription. Modified comments.

~~~

5.
+# Wait till the table tab_upgraded1 reaches 'ready' state
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";
+
+$publisher->safe_psql('postgres',
+ "INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))"
+);
+$publisher->wait_for_catchup('regress_sub2');

IMO better without the blank line, so then everything more clearly
belongs to this same comment.

Modified

~~~

6.
+# Pre-setup for preparing subscription table in init state. Add tab_upgraded2
+# to the publication.
+$publisher->safe_psql('postgres',
+ "ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+ "ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");

Ditto. IMO better without the blank line, so then everything more
clearly belongs to this same comment.

Modified

~~~

7.
+command_ok(
+ [
+ 'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+ '-D', $new_sub->data_dir, '-b', $oldbindir,
+ '-B', $newbindir, '-s', $new_sub->host,
+ '-p', $old_sub->port, '-P', $new_sub->port,
+ $mode
+ ],
+ 'run of pg_upgrade for old instance when the subscription tables are
in init/ready state'
+);

Maybe those 'command_ok' args can be formatted neatly (like you've
done later for the 'command_checks_all').

This is based on the output of pgperltidy. Even if I format it,
pgperltidy reverts the formatting that I have done. I have seen the
same with other upgrade commands in a few places, so I am not
making any changes for this.

~~~

8.
+# ------------------------------------------------------
+# Check that the data inserted to the publisher when the subscriber
is down will
+# be replicated to the new subscriber once the new subscriber is started.
+# ------------------------------------------------------

8a.
SUGGESTION
...when the new subscriber is down will be replicated once it is started.

Modified

~

8b.
I thought this main comment should also say something like "Also check
that the old subscription states and relations origins are all
preserved."

Modified

~~~

9.
+$publisher->safe_psql('postgres', "INSERT INTO tab_upgraded1 VALUES(51)");
+$publisher->safe_psql('postgres', "INSERT INTO tab_upgraded2 VALUES(1)");

IMO it is tidier to combine multiple DDLS whenever you can.

Modified

~~~

10.
+# The subscription's running status should be preserved
+$result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription ORDER BY subname");
+is($result, qq(t
+f),
+ "check that the subscription's running status are preserved"
+);

I felt this was a bit too tricky. It might be more readable to do 2
separate SELECTs with explicit subnames. Alternatively, leave the code
as-is but improve the comment to explicitly say something like:

# old subscription regress_sub was enabled
# old subscription regress_sub1 was disabled

Modified to add comments.

~~~

11.
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than number of subscriptions in the old cluster.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+$old_sub->stop;

/than number/than the number/

Should that old_sub->stop have been part of the previous cleanup steps?

Modified

~~~

12.
+$old_sub->start;
+
+# Drop the subscription
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");

Maybe it is tidier putting that 'start' below the comment.

Modified

~~~

13.
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run in:
+# a) if there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init) and/or
+# b) if the subscription has no replication origin.
+# ------------------------------------------------------

13a.
/refuses to run in:/refuses to run if:/

Modified

~

13b.
/a) if/a)/

Modified

~

13c.
/b) if/b)/

Modified

~~~

14.
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+ "CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION
regress_pub3 WITH (enabled=false)"
+);
+
+my $subid = $old_sub->safe_psql('postgres',
+ "SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
+my $reporigin = 'pg_' . qq($subid);
+
+# Drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+ "SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;

14a.
IMO better to have all this without blank lines, because it all
belongs to the first comment.

Modified

14b.
That 2nd comment "# Drop the..." is not required because the first
comment already says the same.

Modified

======
src/include/catalog/pg_subscription_rel.h

15.
extern void AddSubscriptionRelState(Oid subid, Oid relid, char state,
- XLogRecPtr sublsn);
+ XLogRecPtr sublsn, bool upgrade);

Shouldn't this 'upgrade' really be 'binary_upgrade' so it better
matches the comment you added in that function?

If you agree, then change it here and also in the function definition.

Modified it to retain_lock based on suggestions from [1].

The attached v22 version patch has the changes for the same.

[1]: /messages/by-id/CAA4eK1KFEHhJEo43k_qUpC0Eod34zVq=Kae34koEDrPFXzeeJg@mail.gmail.com

Regards,
Vignesh

Attachments:

v22-0001-Preserve-the-full-subscription-s-state-during-pg.patch (text/x-patch)
From 8ea0711eacf41052465c30442352b1c23eb3a6dc Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v22] Preserve the full subscription's state during pg_upgrade

Previously, only the subscription metadata information was preserved.  Without
the list of relations and their state it's impossible to re-enable the
subscriptions without missing some records as the list of relations can only be
refreshed after enabling the subscription (and therefore starting the apply
worker).  Even if we added a way to refresh the subscription while enabling a
publication, we still wouldn't know which relations are new on the publication
side, and therefore should be fully synced, and which shouldn't.

To fix this problem, this patch teaches pg_dump to restore the content of
pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The new SQL binary_upgrade_add_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_add_sub_rel_state(subname text, relid oid, state char [,sublsn pg_lsn])

In the above, subname is the subscription name, relid is the relation
identifier, the state is the state of the relation, sublsn is subscription lsn
which is optional, and defaults to NULL/InvalidXLogRecPtr if not provided.
pg_dump will retrieve these values (subname, relid, state and sublsn) from the
old cluster.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To fix this problem, this patch teaches pg_dump to update the replication
origin along with create subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

The new SQL binary_upgrade_replorigin_advance function has the following
syntax:
SELECT binary_upgrade_replorigin_advance(subname text, sublsn pg_lsn)

In the above, subname is the subscription name and sublsn is subscription lsn.
pg_dump will retrieve these values (subname and sublsn) from the old cluster.

pg_upgrade will check that all the subscription relations are in 'i' (init) or
in 'r' (ready) state, and will error out if that's not the case, logging the
reason for the failure.

Author: Vignesh C, Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  50 ++++
 src/backend/catalog/pg_subscription.c      |  11 +-
 src/backend/commands/subscriptioncmds.c    |   4 +-
 src/backend/utils/adt/pg_upgrade_support.c | 129 +++++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 232 ++++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  17 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 193 ++++++++++++-
 src/bin/pg_upgrade/info.c                  |  56 +++-
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   2 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 313 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/include/catalog/pg_subscription_rel.h  |   2 +-
 src/tools/pgindent/typedefs.list           |   1 +
 16 files changed, 1039 insertions(+), 15 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 4f78e0e1c0..8c14047aa5 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,56 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies, which include the subscription table information present in
+     the <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and the subscription's replication origin. This allows
+     logical replication on the new subscriber to continue from where the
+     old subscriber left off. Migration of subscription dependencies is only
+     supported when the old cluster is version 17.0 or later. Subscription
+     dependencies on clusters before version 17.0 are silently ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met, an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize) or <literal>r</literal> (ready). This
+       can be verified by checking <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d6a978f136..f8f62892d6 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -228,10 +228,14 @@ textarray_to_stringlist(ArrayType *textarray)
 
 /*
  * Add new state record for a subscription table.
+ *
+ * If retain_lock is true, then don't release the locks taken on
+ * pg_subscription and pg_subscription_rel tables. retain_lock is true in
+ * non-binary-upgrade mode.
  */
 void
 AddSubscriptionRelState(Oid subid, Oid relid, char state,
-						XLogRecPtr sublsn)
+						XLogRecPtr sublsn, bool retain_lock)
 {
 	Relation	rel;
 	HeapTuple	tup;
@@ -269,7 +273,10 @@ AddSubscriptionRelState(Oid subid, Oid relid, char state,
 	heap_freetuple(tup);
 
 	/* Cleanup. */
-	table_close(rel, NoLock);
+	table_close(rel, retain_lock ? NoLock : RowExclusiveLock);
+
+	if (!retain_lock)
+		UnlockSharedObject(SubscriptionRelationId, subid, 0, AccessShareLock);
 }
 
 /*
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index edc82c11be..dd067d39ad 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -773,7 +773,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 										 rv->schemaname, rv->relname);
 
 				AddSubscriptionRelState(subid, relid, table_state,
-										InvalidXLogRecPtr);
+										InvalidXLogRecPtr, true);
 			}
 
 			/*
@@ -943,7 +943,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 			{
 				AddSubscriptionRelState(sub->oid, relid,
 										copy_data ? SUBREL_STATE_INIT : SUBREL_STATE_READY,
-										InvalidXLogRecPtr);
+										InvalidXLogRecPtr, true);
 				ereport(DEBUG1,
 						(errmsg_internal("table \"%s.%s\" added to subscription \"%s\"",
 										 rv->schemaname, rv->relname, sub->name)));
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..82a6b2267f 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,23 @@
 
 #include "postgres.h"
 
+#include "access/relation.h"
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +313,124 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	subrel;
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	subrel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	rel = relation_open(relid, AccessShareLock);
+	AddSubscriptionRelState(subid, relid, relstate, sublsn, false);
+	relation_close(rel, AccessShareLock);
+
+	ReleaseSysCache(tup);
+	table_close(subrel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	remote_commit;
+	char		originname[NAMEDATALEN];
+	RepOriginId node;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	remote_commit = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* Lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	node = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster. Shutting the server down flushes the origins during the
+	 * shutdown checkpoint.
+	 */
+	replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 8c0b5486b9..de4ca03b8b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -297,6 +297,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4618,6 +4619,8 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
+	int			i_subenabled;
 	int			i,
 				ntups;
 
@@ -4673,16 +4676,30 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n"
+							 " s.subenabled\n");
+	else
+		appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n"
+							 " false AS subenabled\n");
+
+	appendPQExpBufferStr(query,
+						 "FROM pg_subscription s\n");
+
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query,
+							 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+							 "    ON o.external_id = 'pg_' || s.oid::text \n");
+
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4709,6 +4726,8 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
+	i_subenabled = PQfnumber(res, "subenabled");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4746,6 +4765,13 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+		subinfo[i].subenabled =
+			pg_strdup(PQgetvalue(res, i, i_subenabled));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4755,6 +4781,165 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to pg_subscription_rel table. This will be used only in
+		 * binary-upgrade mode.
+		 */
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+		appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+		appendPQExpBuffer(query,
+						  ", %u, '%c'",
+						  subrinfo->tblinfo->dobj.catId.oid,
+						  subrinfo->srsubstate);
+
+		if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+			appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn);
+		else
+			appendPQExpBuffer(query, ", NULL");
+
+		appendPQExpBufferStr(query, ");\n");
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4835,6 +5020,43 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	/*
+	 * In binary-upgrade mode, we allow the replication to continue after the
+	 * upgrade.
+	 */
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+	{
+		if (subinfo->suboriginremotelsn)
+		{
+			/*
+			 * Preserve the remote_lsn for the subscriber's replication
+			 * origin. This value is required to start the replication from
+			 * the position before the upgrade. This value will be stale if
+			 * the publisher gets upgraded before the subscriber node.
+			 * However, this shouldn't be a problem as the upgrade ensures
+			 * that all the transactions were replicated before upgrading the
+			 * publisher.
+			 */
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+			appendStringLiteralAH(query, subinfo->dobj.name, fout);
+			appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+		}
+
+		if (strcmp(subinfo->subenabled, "t") == 0)
+		{
+			/*
+			 * Enable the subscription to allow the replication to continue
+			 * after the upgrade.
+			 */
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber's running state.\n");
+			appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s ENABLE;\n", qsubname);
+		}
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10453,6 +10675,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18519,6 +18744,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..7ce34288ea 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,
 } DumpableObjectType;
 
 /*
@@ -660,6 +661,7 @@ typedef struct _SubscriptionInfo
 {
 	DumpableObject dobj;
 	const char *rolname;
+	char	   *subenabled;
 	char	   *subbinary;
 	char	   *substream;
 	char	   *subtwophasestate;
@@ -671,8 +673,21 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +712,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +772,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..f7a8114021 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -34,7 +34,9 @@ static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 static void check_new_cluster_logical_replication_slots(void);
+static void check_new_cluster_subscription_configuration(void);
 static void check_old_cluster_for_valid_slots(bool live_check);
+static void check_old_cluster_subscription_state(void);
 
 
 /*
@@ -112,13 +114,21 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/*
-	 * Logical replication slots can be migrated since PG17. See comments atop
-	 * get_old_cluster_logical_slot_infos().
-	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+	{
+		/*
+		 * Logical replication slots can be migrated since PG17. See comments
+		 * atop get_old_cluster_logical_slot_infos().
+		 */
 		check_old_cluster_for_valid_slots(live_check);
 
+		/*
+		 * Subscriptions and their dependencies can be migrated since PG17.
+		 * See comments atop get_db_subscription_count().
+		 */
+		check_old_cluster_subscription_state();
+	}
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -237,6 +247,8 @@ check_new_cluster(void)
 	check_for_new_tablespace_dir();
 
 	check_new_cluster_logical_replication_slots();
+
+	check_new_cluster_subscription_configuration();
 }
 
 
@@ -1538,6 +1550,53 @@ check_new_cluster_logical_replication_slots(void)
 	check_ok();
 }
 
+/*
+ * check_new_cluster_subscription_configuration()
+ *
+ * Verify that the max_replication_slots configuration specified is enough for
+ * creating the subscriptions. This is required to create the replication
+ * origin for each subscription.
+ */
+static void
+check_new_cluster_subscription_configuration(void)
+{
+	PGresult   *res;
+	PGconn	   *conn;
+	int			nsubs_on_old;
+	int			max_replication_slots;
+
+	/* Subscriptions and their dependencies can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	nsubs_on_old = count_old_cluster_subscriptions();
+
+	/* Quick return if there are no subscriptions to be migrated. */
+	if (nsubs_on_old == 0)
+		return;
+
+	prep_status("Checking for new cluster configuration for subscriptions");
+
+	conn = connectToServer(&new_cluster, "template1");
+
+	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+							"WHERE name = 'max_replication_slots';");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine parameter settings on new cluster");
+
+	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
+	if (nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
+	PQclear(res);
+	PQfinish(conn);
+
+	check_ok();
+}
+
 /*
  * check_old_cluster_for_valid_slots()
  *
@@ -1613,3 +1672,129 @@ check_old_cluster_for_valid_slots(bool live_check)
 
 	check_ok();
 }
+
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that each subscription has all of its corresponding tables in
+ * 'i' (initialize) or 'r' (ready) state. Also validate that all the
+ * subscriptions have their respective replication origin.
+ */
+static void
+check_old_cluster_subscription_state(void)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subs_invalid.txt");
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &old_cluster.dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(&old_cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "The replication origin is missing for database:\"%s\" subscription:\"%s\"\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * We don't allow the upgrade if there is a risk of a dangling slot or
+		 * origin left over from the initial sync after the upgrade.
+		 *
+		 * A slot/origin not created yet refers to the 'i' (initialize) state,
+		 * while 'r' (ready) state refers to a slot/origin created previously
+		 * but already dropped. These states are supported for pg_upgrade. The
+		 * other states listed below are not supported:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * used for the slot name won't match anymore.
+		 *
+		 * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+		 * would retain the replication origin when there is a failure in
+		 * tablesync worker immediately after dropping the replication slot in
+		 * the publisher.
+		 *
+		 * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT r.srsubstate, s.subname, n.nspname, c.relname "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "The table sync state \"%s\" is not allowed for database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\"\n",
+					PQgetvalue(res, i, 0),
+					active_db->db_name,
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in i (initialize) or r (ready) state.\n"
+				 "You can allow the initial sync to finish for all relations and then restart the upgrade.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..cc73c0fc0c 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,6 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
+static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,10 +294,14 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		get_rel_infos(cluster, pDbInfo);
 
 		/*
-		 * Retrieve the logical replication slots infos for the old cluster.
+		 * Retrieve the logical replication slots infos and the subscriptions
+		 * count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
+		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
+			get_db_subscription_count(pDbInfo);
+		}
 	}
 
 	if (cluster == &old_cluster)
@@ -730,6 +735,55 @@ count_old_cluster_logical_slots(void)
 	return slot_count;
 }
 
+/*
+ * get_db_subscription_count()
+ *
+ * Gets the number of subscriptions of the database referred to by "dbinfo".
+ *
+ * Note: This function does nothing if the old cluster is pre-PG17.  Logical
+ * slots are not upgraded from earlier versions, so the logical replication
+ * setup could not be upgraded completely anyway.
+ */
+static void
+get_db_subscription_count(DbInfo *dbinfo)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	res = executeQueryOrDie(conn, "SELECT count(*) "
+							"FROM pg_catalog.pg_subscription WHERE subdbid = %d",
+							dbinfo->db_oid);
+	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscriptions for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 or prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_db_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+	int			nsubs = 0;
+
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+	return nsubs;
+}
+
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..d63f13fffc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -195,6 +195,7 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
+	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -421,6 +422,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
+int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..b1eadd8a0d
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,313 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+my $oldbindir = $old_sub->config_data('--bindir');
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $newbindir = $new_sub->config_data('--bindir');
+
+# In a VPATH build, we'll be started in the source directory, but we want
+# to run pg_upgrade in the build directory so that any generated files end up
+# in it, like delete_old_cluster.{sh,bat}.
+chdir ${PostgreSQL::Test::Utils::tmp_check};
+
+# Initial setup
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE TABLE tab_upgraded2(id int);
+]);
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE TABLE tab_upgraded2(id int);
+]);
+
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+# Set up an enabled subscription to verify that an enabled subscription is
+# still enabled after the upgrade.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+# Prepare to verify that the upgrade succeeds with tables in 'ready'/'init'
+# state, and that the replication origin's remote LSN and the subscription's
+# running status are retained.
+# Set up a subscription with:
+# a) table tab_upgraded1 in 'ready' state
+# b) table tab_upgraded2 in 'init' state
+# c) a valid remote_lsn for its replication origin
+# d) disabled state
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub2 FOR TABLE tab_upgraded1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2"
+);
+# a) Wait till the table tab_upgraded1 reaches 'ready' state
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))");
+$publisher->wait_for_catchup('regress_sub2');
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub->restart;
+# Prepare a subscription table in init state: add tab_upgraded2 to the
+# publication.
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+# b) The table tab_upgraded2 will be in init state as the subscriber
+# configuration for max_logical_replication_workers is set to 0.
+my $result = $old_sub->safe_psql('postgres',
+	"SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
+is($result, qq(t), "Check that the table is in init state");
+# c) Get the replication origin remote_lsn of the old subscriber
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub2'"
+);
+# d) Have the subscription in disabled state before upgrade
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade succeeds when all tables are in ready or init state
+# (tab_upgraded1 is in ready state and tab_upgraded2 in init state), and that
+# the replication origin's remote LSN and the subscription's running status
+# are retained.
+# ------------------------------------------------------
+command_ok(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode
+	],
+	'run of pg_upgrade for old instance when the subscription tables are in init/ready state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# ------------------------------------------------------
+# Check that data inserted on the publisher while the new subscriber was down
+# is replicated once it is started. Also check that the old subscription
+# states and relation origins are all preserved.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		INSERT INTO tab_upgraded1 VALUES(51);
+		INSERT INTO tab_upgraded2 VALUES(1);
+]);
+
+$new_sub->start;
+
+# The subscription's running status should be preserved. Old subscription
+# regress_sub1 should be enabled and old subscription regress_sub2 should be
+# disabled.
+$result =
+  $new_sub->safe_psql('postgres',
+	"SELECT subenabled FROM pg_subscription ORDER BY subname");
+is( $result, qq(t
+f),
+	"check that the subscriptions' running statuses are preserved");
+
+my $sub_oid = $new_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+
+# Subscription relations should be preserved
+$result =
+  $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM pg_subscription_rel WHERE srsubid = $sub_oid");
+is($result, qq(2),
+	"there should be 2 rows in pg_subscription_rel (representing tab_upgraded1 and tab_upgraded2)"
+);
+
+# The replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status WHERE external_id = 'pg_' || $sub_oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
+$publisher->wait_for_catchup('regress_sub2');
+
+# Rows on tab_upgraded1 and tab_upgraded2 should have been replicated
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(1),
+	"check the data is synced after enabling the subscription for the table that was in init state"
+);
+
+# cleanup
+$new_sub->stop;
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 4");
+$old_sub->start;
+$old_sub->safe_psql(
+	'postgres', qq[
+		ALTER SUBSCRIPTION regress_sub1 DISABLE;
+		ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none);
+		DROP SUBSCRIPTION regress_sub1;
+]);
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+# Drop the subscription
+$old_sub->start;
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init) and/or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE PUBLICATION regress_pub3 FOR TABLE tab_primary_key;
+]);
+
+# Insert into the subscriber's primary key column the same value that is
+# already present on the publisher, so that the table sync will fail.
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub3;
+]);
+
+# The table will be in 'd' (data is being copied) state, as table sync will
+# fail with a primary key violation.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (enabled=false)"
+);
+my $subid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
+my $reporigin = 'pg_' . qq($subid);
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync (invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains the list of tables that cannot be upgraded.
+# The file's path cannot be predicted because the output directory name
+# contains a millisecond timestamp, so File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub1->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content, which should report the tab_primary_key table as
+# being in an invalid state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub3\" schema:\"public\" relation:\"tab_primary_key\"/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub4 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub4\"/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fb58dee3bc..45c681db5e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11396,6 +11396,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h
index f5324b710d..34ec3117a3 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -81,7 +81,7 @@ typedef struct SubscriptionRelState
 } SubscriptionRelState;
 
 extern void AddSubscriptionRelState(Oid subid, Oid relid, char state,
-									XLogRecPtr sublsn);
+									XLogRecPtr sublsn, bool retain_lock);
 extern void UpdateSubscriptionRelState(Oid subid, Oid relid, char state,
 									   XLogRecPtr sublsn);
 extern char GetSubscriptionRelState(Oid subid, Oid relid, XLogRecPtr *sublsn);
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index d659adbfd6..0168f10348 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2665,6 +2665,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#164Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#163)
1 attachment(s)
Re: pg_upgrade and logical replication

On Fri, Dec 1, 2023 at 11:24 PM vignesh C <vignesh21@gmail.com> wrote:

The attached v22 version patch has the changes for the same.

I have made minor changes in the comments and code at various places.
See and let me know if you are not happy with the changes. I think
unless there are more suggestions or comments, we can proceed with
committing it.

--
With Regards,
Amit Kapila.

Attachments:

v23-0001-Allow-upgrades-to-preserve-the-full-subscription.patchapplication/octet-stream; name=v23-0001-Allow-upgrades-to-preserve-the-full-subscription.patchDownload
From 3258b39b24b77f5cedfd5ee4fa729b98353b53c3 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 30 Oct 2023 12:31:59 +0530
Subject: [PATCH v23] Allow upgrades to preserve the full subscription's state.

This feature will allow us to replicate the changes on subscriber nodes
after the upgrade.

Previously, only the subscription metadata information was preserved.
Without the list of relations and their state, it's not possible to
re-enable the subscriptions without missing some records as the list of
relations can only be refreshed after enabling the subscription (and
therefore starting the apply worker).  Even if we added a way to refresh
the subscription while enabling a publication, we still wouldn't know
which relations are new on the publication side, and therefore should be
fully synced, and which shouldn't.

To preserve the subscription relations, this patch teaches pg_dump to
restore the content of pg_subscription_rel from the old cluster by using
the binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To preserve the replication origins, this patch teaches pg_dump to update
the replication origin along with creating a subscription, by using the
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

pg_upgrade will check that all the subscription relations are in 'i'
(init) or in 'r' (ready) state and will error out if that's not the case,
logging the reason for the failure. This helps to avoid the risk of any
dangling slot or origin after the upgrade.

Author: Vignesh C, Julien Rouhaud, Shlok Kyal
Reviewed-by: Peter Smith, Amit Kapila, Michael Paquier, Hayato Kuroda
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  50 ++++
 src/backend/catalog/pg_subscription.c      |  16 +-
 src/backend/commands/subscriptioncmds.c    |   4 +-
 src/backend/utils/adt/pg_upgrade_support.c | 129 +++++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 232 +++++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  24 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 193 ++++++++++++-
 src/bin/pg_upgrade/info.c                  |  56 +++-
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   2 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 306 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/include/catalog/pg_subscription_rel.h  |   2 +-
 src/tools/pgindent/typedefs.list           |   1 +
 16 files changed, 1044 insertions(+), 15 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl
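To illustrate the restore path the commit message describes: in binary-upgrade mode, the dump would contain calls to the two new support functions along these lines (a hedged sketch based on the signatures added in pg_proc.dat; the subscription name, relation OID, and LSN values are placeholders, and the exact SQL pg_dump emits may differ):

```sql
-- Recreate a pg_subscription_rel row for the named subscription:
-- (subscription name, relation OID, sync state, sync LSN; the LSN
-- may be NULL for a relation that was in init state).
SELECT pg_catalog.binary_upgrade_add_sub_rel_state(
    'regress_sub2', 16384, 'r', '0/15D7E10');

-- Restore the subscription's replication origin progress so that apply
-- resumes from the recorded remote LSN instead of replaying everything.
SELECT pg_catalog.binary_upgrade_replorigin_advance(
    'regress_sub2', '0/15D7E10');
```

Both functions error out unless the server was started in binary-upgrade mode.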

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 4f78e0e1c0..ee6f6288e5 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,56 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies, which include the subscription's table information present
+     in the <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and the subscription's replication origin. This allows
+     logical replication on the new subscriber to continue from where the
+     old subscriber left off. Migration of subscription dependencies is only
+     supported when the old cluster is version 17.0 or later. Subscription
+     dependencies on clusters before version 17.0 will silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met, an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize) or <literal>r</literal> (ready). This
+       can be verified by checking <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
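The prerequisites above can be verified manually on the old cluster with queries along these lines (an illustrative sketch, not part of the documentation patch; origins are named <literal>pg_</literal> followed by the subscription OID):

```sql
-- 1. Every subscription relation must be in 'i' (init) or 'r' (ready)
--    state; this must return no rows in each database.
SELECT srsubid, srrelid::regclass, srsubstate
FROM pg_catalog.pg_subscription_rel
WHERE srsubstate NOT IN ('i', 'r');

-- 2. Every subscription must still have its replication origin; this
--    must return no rows.
SELECT s.subname
FROM pg_catalog.pg_subscription s
LEFT JOIN pg_catalog.pg_replication_origin o
       ON o.roname = 'pg_' || s.oid
WHERE o.roname IS NULL;

-- 3. The total subscription count must not exceed max_replication_slots
--    on the new cluster.
SELECT count(*) AS subscriptions FROM pg_catalog.pg_subscription;
```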
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d6a978f136..7167377d82 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -228,10 +228,14 @@ textarray_to_stringlist(ArrayType *textarray)
 
 /*
  * Add new state record for a subscription table.
+ *
+ * If retain_lock is true, don't release the locks taken in this function;
+ * they are released at the end of the transaction as usual.  In
+ * binary-upgrade mode we instead release them immediately.
  */
 void
 AddSubscriptionRelState(Oid subid, Oid relid, char state,
-						XLogRecPtr sublsn)
+						XLogRecPtr sublsn, bool retain_lock)
 {
 	Relation	rel;
 	HeapTuple	tup;
@@ -269,7 +273,15 @@ AddSubscriptionRelState(Oid subid, Oid relid, char state,
 	heap_freetuple(tup);
 
 	/* Cleanup. */
-	table_close(rel, NoLock);
+	if (retain_lock)
+	{
+		table_close(rel, NoLock);
+	}
+	else
+	{
+		table_close(rel, RowExclusiveLock);
+		UnlockSharedObject(SubscriptionRelationId, subid, 0, AccessShareLock);
+	}
 }
 
 /*
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index edc82c11be..dd067d39ad 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -773,7 +773,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 										 rv->schemaname, rv->relname);
 
 				AddSubscriptionRelState(subid, relid, table_state,
-										InvalidXLogRecPtr);
+										InvalidXLogRecPtr, true);
 			}
 
 			/*
@@ -943,7 +943,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 			{
 				AddSubscriptionRelState(sub->oid, relid,
 										copy_data ? SUBREL_STATE_INIT : SUBREL_STATE_READY,
-										InvalidXLogRecPtr);
+										InvalidXLogRecPtr, true);
 				ereport(DEBUG1,
 						(errmsg_internal("table \"%s.%s\" added to subscription \"%s\"",
 										 rv->schemaname, rv->relname, sub->name)));
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 2f6fc86c3d..82a6b2267f 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,23 @@
 
 #include "postgres.h"
 
+#include "access/relation.h"
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +313,124 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	subrel;
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+	ReleaseSysCache(tup);
+
+	subrel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	rel = relation_open(relid, AccessShareLock);
+	AddSubscriptionRelState(subid, relid, relstate, sublsn, false);
+	relation_close(rel, AccessShareLock);
+
+	ReleaseSysCache(tup);
+	table_close(subrel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	remote_commit;
+	char		originname[NAMEDATALEN];
+	RepOriginId node;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	remote_commit = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* Lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	node = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster, and the shutdown checkpoint will flush the origin state to
+	 * disk.
+	replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 8c0b5486b9..de4ca03b8b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -297,6 +297,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4618,6 +4619,8 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
+	int			i_subenabled;
 	int			i,
 				ntups;
 
@@ -4673,16 +4676,30 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n"
+							 " s.subenabled\n");
+	else
+		appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n"
+							 " false AS subenabled\n");
+
+	appendPQExpBufferStr(query,
+						 "FROM pg_subscription s\n");
+
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query,
+							 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+							 "    ON o.external_id = 'pg_' || s.oid::text \n");
+
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4709,6 +4726,8 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
+	i_subenabled = PQfnumber(res, "subenabled");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4746,6 +4765,13 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+		subinfo[i].subenabled =
+			pg_strdup(PQgetvalue(res, i, i_subenabled));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4755,6 +4781,165 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PQExpBuffer query;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	query = createPQExpBuffer();
+	appendPQExpBufferStr(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+					  " FROM pg_catalog.pg_subscription_rel"
+					  " ORDER BY srsubid");
+	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+	destroyPQExpBuffer(query);
+}
+
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+	 * binary_upgrade_add_sub_rel_state will add the subscription relation
+	 * to the pg_subscription_rel catalog. This is used only in
+	 * binary-upgrade mode.
+		 */
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+		appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+		appendPQExpBuffer(query,
+						  ", %u, '%c'",
+						  subrinfo->tblinfo->dobj.catId.oid,
+						  subrinfo->srsubstate);
+
+		if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+			appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn);
+		else
+			appendPQExpBufferStr(query, ", NULL");
+
+		appendPQExpBufferStr(query, ");\n");
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4835,6 +5020,43 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	/*
+	 * In binary-upgrade mode, we allow the replication to continue after the
+	 * upgrade.
+	 */
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+	{
+		if (subinfo->suboriginremotelsn)
+		{
+			/*
+			 * Preserve the remote_lsn for the subscriber's replication
+			 * origin. This value is required to start the replication from
+			 * the position before the upgrade. This value will be stale if
+			 * the publisher gets upgraded before the subscriber node.
+			 * However, this shouldn't be a problem as the upgrade ensures
+			 * that all the transactions were replicated before upgrading the
+			 * publisher.
+			 */
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+			appendStringLiteralAH(query, subinfo->dobj.name, fout);
+			appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+		}
+
+		if (strcmp(subinfo->subenabled, "t") == 0)
+		{
+			/*
+			 * Enable the subscription to allow the replication to continue
+			 * after the upgrade.
+			 */
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber's running state.\n");
+			appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s ENABLE;\n", qsubname);
+		}
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10453,6 +10675,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18519,6 +18744,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
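
As a note for reviewers: based on dumpSubscriptionTable() above, the restore script produced in binary-upgrade mode would contain calls of roughly the following shape. This is only a sketch; the subscription name, relation OIDs and LSN below are illustrative placeholders, not values from any real dump.

```sql
-- Sketch of the statements pg_dump --binary-upgrade would emit per
-- subscription relation (placeholder values throughout).
SELECT pg_catalog.binary_upgrade_add_sub_rel_state('regress_sub2', 16384, 'r', '0/1A2B3C4D');
-- For a relation still in 'i' (initialize) state, srsublsn is NULL:
SELECT pg_catalog.binary_upgrade_add_sub_rel_state('regress_sub2', 16385, 'i', NULL);
```
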
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..552b93e3a6 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,		/* see note for SubRelInfo */
 } DumpableObjectType;
 
 /*
@@ -660,6 +661,7 @@ typedef struct _SubscriptionInfo
 {
 	DumpableObject dobj;
 	const char *rolname;
+	char	   *subenabled;
 	char	   *subbinary;
 	char	   *substream;
 	char	   *subtwophasestate;
@@ -671,8 +673,28 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ *
+ * Note: Currently, the subscription tables are added to the subscription
+ * after the subscription is enabled in binary-upgrade mode. Since apply
+ * workers are not started in binary-upgrade mode, this ordering does not
+ * matter there. The relative order of adding the subscription tables and
+ * enabling the subscription would need to be reconsidered if this feature
+ * is ever supported outside of binary-upgrade mode.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +719,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +779,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..5a1ebac4b1 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -34,7 +34,9 @@ static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 static void check_new_cluster_logical_replication_slots(void);
+static void check_new_cluster_subscription_configuration(void);
 static void check_old_cluster_for_valid_slots(bool live_check);
+static void check_old_cluster_subscription_state(void);
 
 
 /*
@@ -112,13 +114,21 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/*
-	 * Logical replication slots can be migrated since PG17. See comments atop
-	 * get_old_cluster_logical_slot_infos().
-	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+	{
+		/*
+		 * Logical replication slots can be migrated since PG17. See comments
+		 * atop get_old_cluster_logical_slot_infos().
+		 */
 		check_old_cluster_for_valid_slots(live_check);
 
+		/*
+		 * Subscriptions and their dependencies can be migrated since PG17.
+		 * See comments atop get_db_subscription_count().
+		 */
+		check_old_cluster_subscription_state();
+	}
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -237,6 +247,8 @@ check_new_cluster(void)
 	check_for_new_tablespace_dir();
 
 	check_new_cluster_logical_replication_slots();
+
+	check_new_cluster_subscription_configuration();
 }
 
 
@@ -1538,6 +1550,53 @@ check_new_cluster_logical_replication_slots(void)
 	check_ok();
 }
 
+/*
+ * check_new_cluster_subscription_configuration()
+ *
+ * Verify that the new cluster's max_replication_slots setting is high enough
+ * to accommodate all the subscriptions on the old cluster, as a replication
+ * origin is created for each subscription.
+ */
+static void
+check_new_cluster_subscription_configuration(void)
+{
+	PGresult   *res;
+	PGconn	   *conn;
+	int			nsubs_on_old;
+	int			max_replication_slots;
+
+	/* Subscriptions and their dependencies can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	nsubs_on_old = count_old_cluster_subscriptions();
+
+	/* Quick return if there are no subscriptions to be migrated. */
+	if (nsubs_on_old == 0)
+		return;
+
+	prep_status("Checking for new cluster configuration for subscriptions");
+
+	conn = connectToServer(&new_cluster, "template1");
+
+	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+							"WHERE name = 'max_replication_slots';");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine parameter settings on new cluster");
+
+	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
+	if (nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
+	PQclear(res);
+	PQfinish(conn);
+
+	check_ok();
+}
+
 /*
  * check_old_cluster_for_valid_slots()
  *
@@ -1613,3 +1672,129 @@ check_old_cluster_for_valid_slots(bool live_check)
 
 	check_ok();
 }
+
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that the replication origin corresponding to each of the
+ * subscriptions is present and each of the subscribed tables is in
+ * 'i' (initialize) or 'r' (ready) state.
+ */
+static void
+check_old_cluster_subscription_state(void)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subs_invalid.txt");
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &old_cluster.dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(&old_cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "The replication origin is missing for database:\"%s\" subscription:\"%s\"\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * We don't allow upgrade if there is a risk of dangling slot or
+		 * origin corresponding to initial sync after upgrade.
+		 *
+		 * In the 'i' (initialize) state the slot/origin has not been created
+		 * yet, while in the 'r' (ready) state it was created previously and
+		 * has already been dropped. These two states are therefore supported
+		 * for pg_upgrade. The other states listed below are not supported:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * used for the slot name won't match anymore.
+		 *
+		 * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+		 * would retain the replication origin when there is a failure in
+		 * tablesync worker immediately after dropping the replication slot in
+		 * the publisher.
+		 *
+		 * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT r.srsubstate, s.subname, n.nspname, c.relname "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "The table sync state \"%s\" is not allowed for database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\"\n",
+					PQgetvalue(res, i, 0),
+					active_db->db_name,
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in the 'i' (initialize) or 'r' (ready) state.\n"
+				 "You can allow the initial sync of all relations to finish and then restart the upgrade.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..cc73c0fc0c 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,6 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
+static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,10 +294,14 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		get_rel_infos(cluster, pDbInfo);
 
 		/*
-		 * Retrieve the logical replication slots infos for the old cluster.
+		 * Retrieve the logical replication slots infos and the subscriptions
+		 * count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
+		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
+			get_db_subscription_count(pDbInfo);
+		}
 	}
 
 	if (cluster == &old_cluster)
@@ -730,6 +735,55 @@ count_old_cluster_logical_slots(void)
 	return slot_count;
 }
 
+/*
+ * get_db_subscription_count()
+ *
+ * Gets the number of subscriptions of the database referred to by "dbinfo".
+ *
+ * Note: This function will not do anything if the old cluster is pre-PG17.
+ * This is because logical slots are not upgraded before that version, so a
+ * logical replication cluster could not be upgraded completely anyway.
+ */
+static void
+get_db_subscription_count(DbInfo *dbinfo)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	res = executeQueryOrDie(conn, "SELECT count(*) "
+							"FROM pg_catalog.pg_subscription WHERE subdbid = %u",
+							dbinfo->db_oid);
+	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscriptions for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 or prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_db_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+	int			nsubs = 0;
+
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+	return nsubs;
+}
+
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..d63f13fffc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -195,6 +195,7 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
+	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -421,6 +422,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
+int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..716adc51d1
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,306 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+my $oldbindir = $old_sub->config_data('--bindir');
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $newbindir = $new_sub->config_data('--bindir');
+
+# In a VPATH build, we'll be started in the source directory, but we want
+# to run pg_upgrade in the build directory so that any files generated finish
+# in it, like delete_old_cluster.{sh,bat}.
+chdir ${PostgreSQL::Test::Utils::tmp_check};
+
+# Initial setup
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE TABLE tab_upgraded2(id int);
+]);
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE TABLE tab_upgraded2(id int);
+]);
+
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+# Setup an enabled subscription to verify that the running status is retained
+# after upgrade.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+# Verify that the upgrade succeeds with tables in 'ready'/'init' state, and
+# that the replication origin's remote_lsn and the subscription's running
+# status are retained.
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub2 FOR TABLE tab_upgraded1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2"
+);
+# a) Wait till the table tab_upgraded1 reaches 'ready' state
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))");
+$publisher->wait_for_catchup('regress_sub2');
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub->restart;
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+# b) The table tab_upgraded2 will be in init state as the subscriber
+# configuration for max_logical_replication_workers is set to 0.
+my $result = $old_sub->safe_psql('postgres',
+	"SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
+is($result, qq(t), "Check that the table is in init state");
+# c) Get the replication origin remote_lsn of the old subscriber
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub2'"
+);
+# d) Have the subscription in disabled state before upgrade
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade succeeds when all tables are in 'ready' or 'init'
+# state (tab_upgraded1 is in ready state, tab_upgraded2 in init state), and
+# that the replication origin's remote_lsn and the subscription's running
+# status are retained.
+# ------------------------------------------------------
+command_ok(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode
+	],
+	'run of pg_upgrade for old instance when the subscription tables are in init/ready state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# ------------------------------------------------------
+# Check that data inserted on the publisher while the new subscriber is down
+# is replicated once the subscriber is started. Also check that the old
+# subscription states and relation origins are all preserved.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		INSERT INTO tab_upgraded1 VALUES(51);
+		INSERT INTO tab_upgraded2 VALUES(1);
+]);
+
+$new_sub->start;
+
+# The subscription's running status should be preserved. Old subscription
+# regress_sub1 should be enabled and old subscription regress_sub2 should be
+# disabled.
+$result =
+  $new_sub->safe_psql('postgres',
+	"SELECT subenabled FROM pg_subscription ORDER BY subname");
+is( $result, qq(t
+f),
+	"check that the subscriptions' running statuses are preserved");
+
+my $sub_oid = $new_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+
+# Subscription relations should be preserved
+$result =
+  $new_sub->safe_psql('postgres',
+	"SELECT count(*) FROM pg_subscription_rel WHERE srsubid = $sub_oid");
+is($result, qq(2),
+	"there should be 2 rows in pg_subscription_rel (representing tab_upgraded1 and tab_upgraded2)"
+);
+
+# The replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status WHERE external_id = 'pg_' || $sub_oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
+$publisher->wait_for_catchup('regress_sub2');
+
+# Rows on tab_upgraded1 and tab_upgraded2 should have been replicated
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(1),
+	"check the data is synced after enabling the subscription for the table that was in init state"
+);
+
+# cleanup
+$new_sub->stop;
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 4");
+$old_sub->start;
+$old_sub->safe_psql(
+	'postgres', qq[
+		ALTER SUBSCRIPTION regress_sub1 DISABLE;
+		ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none);
+		DROP SUBSCRIPTION regress_sub1;
+]);
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+# Drop the subscription
+$old_sub->start;
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init) and/or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE PUBLICATION regress_pub3 FOR TABLE tab_primary_key;
+]);
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub3;
+]);
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (enabled=false)"
+);
+my $subid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
+my $reporigin = 'pg_' . qq($subid);
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub1->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub3\" schema:\"public\" relation:\"tab_primary_key\"/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub4 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub4\"/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fb58dee3bc..45c681db5e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11396,6 +11396,16 @@
   provolatile => 'v', proparallel => 'u', prorettype => 'bool',
   proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h
index f5324b710d..34ec3117a3 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -81,7 +81,7 @@ typedef struct SubscriptionRelState
 } SubscriptionRelState;
 
 extern void AddSubscriptionRelState(Oid subid, Oid relid, char state,
-									XLogRecPtr sublsn);
+									XLogRecPtr sublsn, bool retain_lock);
 extern void UpdateSubscriptionRelState(Oid subid, Oid relid, char state,
 									   XLogRecPtr sublsn);
 extern char GetSubscriptionRelState(Oid subid, Oid relid, XLogRecPtr *sublsn);
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index d659adbfd6..0168f10348 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2665,6 +2665,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.28.0.windows.1

#165Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#164)
Re: pg_upgrade and logical replication

On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:

I have made minor changes in the comments and code at various places.
See and let me know if you are not happy with the changes. I think
unless there are more suggestions or comments, we can proceed with
committing it.

Yeah. I am planning to look more closely at what you have here, and
it is going to take me a bit more time though (some more stuff planned
for next CF, an upcoming conference and end/beginning-of-year
vacations), but I think that targeting the beginning of next CF in
January would be OK.

Overall, I have the impression that the patch looks pretty solid, with
a restriction in place for "init" and "ready" relations, while there
are tests to check all the states that we expect. Seeing coverage
about all that makes me a happy hacker.

+ * If retain_lock is true, then don't release the locks taken in this function.
+ * We normally release the locks at the end of transaction but in binary-upgrade
+ * mode, we expect to release those immediately.

I think that this should be documented in pg_upgrade_support.c where
the caller expects the locks to be released, and why these should be
released. There is a risk that this comment becomes obsolete if
AddSubscriptionRelState() with locks released is called in a different
code path. Anyway, I am not sure I get why this is OK, or even
necessary. It seems like a good practice to keep the locks on the
subscription until the transaction that updates its state. If there's
a specific reason explaining why that's better, the patch should tell
why.

+     * However, this shouldn't be a problem as the upgrade ensures
+     * that all the transactions were replicated before upgrading the
+     * publisher.

This wording looks a bit confusing to me, as "the upgrade" could refer
to the upgrade of a subscriber, but what we want to tell is that the
replay of the transactions is enforced when doing a publisher upgrade.
I'd suggest something like "the upgrade of the publisher ensures that
all the transactions were replicated before upgrading it".

+my $result = $old_sub->safe_psql('postgres',
+   "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
+is($result, qq(t), "Check that the table is in init state");

Hmm. Not sure that this is safe. Shouldn't this be a
poll_query_until(), polling that the state of the relation is what we
want it to be after requesting a refresh of the publication on the
subscriber?
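
A polling variant, in the file's usual idiom, would look roughly like this (a sketch only; it assumes the same $old_sub node and query as the quoted test):

```perl
# Sketch: poll until the relation reaches the expected state instead of
# asserting it immediately after the refresh.
$old_sub->poll_query_until('postgres',
	"SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'")
  or die "Timed out while waiting for the table state to become 'i' (init)";
```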
--
Michael

#166Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#165)
Re: pg_upgrade and logical replication

On Tue, Dec 5, 2023 at 10:56 AM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:

I have made minor changes in the comments and code at various places.
See and let me know if you are not happy with the changes. I think
unless there are more suggestions or comments, we can proceed with
committing it.

Yeah. I am planning to look more closely at what you have here, and
it is going to take me a bit more time though (some more stuff planned
for next CF, an upcoming conference and end/beginning-of-year
vacations), but I think that targeting the beginning of next CF in
January would be OK.

Overall, I have the impression that the patch looks pretty solid, with
a restriction in place for "init" and "ready" relations, while there
are tests to check all the states that we expect. Seeing coverage
about all that makes me a happy hacker.

+ * If retain_lock is true, then don't release the locks taken in this function.
+ * We normally release the locks at the end of transaction but in binary-upgrade
+ * mode, we expect to release those immediately.

I think that this should be documented in pg_upgrade_support.c where
the caller expects the locks to be released, and why these should be
released. There is a risk that this comment becomes obsolete if
AddSubscriptionRelState() with locks released is called in a different
code path. Anyway, I am not sure I get why this is OK, or even
necessary. It seems like a good practice to keep the locks on the
subscription until the transaction that updates its state. If there's
a specific reason explaining why that's better, the patch should tell
why.

It is to be consistent with other code paths in the upgrade. We
followed existing coding rules like what we do in
binary_upgrade_set_missing_value->SetAttrMissing(). The probable
theory is that during the upgrade we are not worried about concurrent
operations being blocked till the transaction ends. As in this
particular case, we know that the apply worker won't try to sync any
of those relations or a concurrent DDL won't try to remove it from the
pg_subscription_rel. This point is not being explicitly commented
because of its similarity with the existing code.

+my $result = $old_sub->safe_psql('postgres',
+   "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
+is($result, qq(t), "Check that the table is in init state");

Hmm. Not sure that this is safe. Shouldn't this be a
poll_query_until(), polling that the state of the relation is what we
want it to be after requesting a refresh of the publication on the
subscriber?

This is safe because the init state should be marked by the "Alter
Subscription ... Refresh .." command itself. What exactly makes you
think that such a poll would be required?
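
The claim, in other words, is that the state change is part of the command itself, so the check can run immediately afterwards (SQL sketch, using the subscription name from the test):

```sql
-- On an enabled subscription, REFRESH PUBLICATION inserts the newly
-- published relation into pg_subscription_rel with srsubstate = 'i'
-- before returning, so this query is deterministic right after it.
ALTER SUBSCRIPTION regress_sub1 REFRESH PUBLICATION;
SELECT srrelid::regclass, srsubstate FROM pg_subscription_rel;
```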

--
With Regards,
Amit Kapila.

#167Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#164)
Re: pg_upgrade and logical replication

On Mon, Dec 4, 2023 at 8:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Dec 1, 2023 at 11:24 PM vignesh C <vignesh21@gmail.com> wrote:

The attached v22 version patch has the changes for the same.

I have made minor changes in the comments and code at various places.
See and let me know if you are not happy with the changes. I think
unless there are more suggestions or comments, we can proceed with
committing it.

It seems the patch is already close to ready-to-commit state but I've
had a look at the v23 patch with fresh eyes. It looks mostly good to
me and there are some minor comments:

---
+   tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+   if (!HeapTupleIsValid(tup))
+       ereport(ERROR,
+               errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+               errmsg("relation %u does not exist", relid));
+   ReleaseSysCache(tup);

Given what we want to do here is just an existence check, isn't it
clearer if we use SearchSysCacheExists1() instead?
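
That is, the block could be reduced to something like (a sketch of the suggested change, not a standalone compilable unit):

```c
/* Only an existence check is needed; no tuple fetch and release required. */
if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
	ereport(ERROR,
			errcode(ERRCODE_INVALID_PARAMETER_VALUE),
			errmsg("relation %u does not exist", relid));
```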

---
+        query = createPQExpBuffer();
+        appendPQExpBuffer(query, "SELECT srsubid, srrelid,
srsubstate, srsublsn"
+                                          " FROM
pg_catalog.pg_subscription_rel"
+                                          " ORDER BY srsubid");
+        res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+

Probably we don't need to use PQExpBuffer here since the query to
execute is a static string.
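
In other words, the buffer could be replaced with a plain string literal (sketch):

```c
/* The query text is fixed, so no PQExpBuffer is needed. */
res = ExecuteSqlQuery(fout,
					  "SELECT srsubid, srrelid, srsubstate, srsublsn"
					  " FROM pg_catalog.pg_subscription_rel"
					  " ORDER BY srsubid",
					  PGRES_TUPLES_OK);
```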

---
+# The subscription's running status should be preserved. Old subscription
+# regress_sub1 should be enabled and old subscription regress_sub2 should be
+# disabled.
+$result =
+  $new_sub->safe_psql('postgres',
+        "SELECT subenabled FROM pg_subscription ORDER BY subname");
+is( $result, qq(t
+f),
+        "check that the subscription's running status are preserved");
+

How about showing the subname along with the subenabled so that we can
check whether each subscription is in the expected state in case
something goes wrong?
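
Concretely, the query could become (sketch):

```sql
-- Including subname makes a failure report show which subscription
-- is in the wrong state.
SELECT subname, subenabled FROM pg_subscription ORDER BY subname;
```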

---
+# Subscription relations should be preserved
+$result =
+  $new_sub->safe_psql('postgres',
+        "SELECT count(*) FROM pg_subscription_rel WHERE srsubid = $sub_oid");
+is($result, qq(2),
+        "there should be 2 rows in pg_subscription_rel(representing
tab_upgraded1 and tab_upgraded2)"
+);

Is there any reason why we check only the number of rows in
pg_subscription_rel? I guess it might be a good idea to check if table
OIDs there are also preserved.
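
For instance, the check could compare the preserved relations by name and state rather than by count (SQL sketch; $sub_oid stands for the OID fetched earlier in the test):

```sql
-- Verify the relation identities (and states), not just the row count.
SELECT srrelid::regclass AS relname, srsubstate
  FROM pg_subscription_rel
 WHERE srsubid = $sub_oid
 ORDER BY srrelid::regclass::text;
```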

---
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
+$publisher->wait_for_catchup('regress_sub2');
+

IIUC after making the subscription regress_sub2 enabled, we will start
the initial table sync for the table tab_upgraded2. If so, shouldn't
we use wait_for_subscription_sync() instead?
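
In the tests' idiom that would be roughly (a sketch, reusing the $new_sub and $publisher nodes from the test file):

```perl
# Enable the subscription and wait for the initial table sync to finish,
# rather than only waiting for WAL catchup on the publisher.
$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
$new_sub->wait_for_subscription_sync($publisher, 'regress_sub2');
```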

---
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+        "CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr'
PUBLICATION regress_pub3 WITH (enabled=false)"

It's better to put spaces before and after '='.

---
+my $subid = $old_sub->safe_psql('postgres',
+        "SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");

I think we can reuse $sub_oid.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#168Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#166)
Re: pg_upgrade and logical replication

On Tue, Dec 5, 2023 at 6:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 5, 2023 at 10:56 AM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:

I have made minor changes in the comments and code at various places.
See and let me know if you are not happy with the changes. I think
unless there are more suggestions or comments, we can proceed with
committing it.

Yeah. I am planning to look more closely at what you have here, and
it is going to take me a bit more time though (some more stuff planned
for next CF, an upcoming conference and end/beginning-of-year
vacations), but I think that targeting the beginning of next CF in
January would be OK.

Overall, I have the impression that the patch looks pretty solid, with
a restriction in place for "init" and "ready" relations, while there
are tests to check all the states that we expect. Seeing coverage
about all that makes me a happy hacker.

+ * If retain_lock is true, then don't release the locks taken in this function.
+ * We normally release the locks at the end of transaction but in binary-upgrade
+ * mode, we expect to release those immediately.

I think that this should be documented in pg_upgrade_support.c where
the caller expects the locks to be released, and why these should be
released. There is a risk that this comment becomes obsolete if
AddSubscriptionRelState() with locks released is called in a different
code path. Anyway, I am not sure I get why this is OK, or even
necessary. It seems like a good practice to keep the locks on the
subscription until the transaction that updates its state. If there's
a specific reason explaining why that's better, the patch should tell
why.

It is to be consistent with other code paths in the upgrade. We
followed existing coding rules like what we do in
binary_upgrade_set_missing_value->SetAttrMissing(). The probable
theory is that during the upgrade we are not worried about concurrent
operations being blocked till the transaction ends. As in this
particular case, we know that the apply worker won't try to sync any
of those relations or a concurrent DDL won't try to remove it from the
pg_subscrition_rel. This point is not being explicitly commented
because of its similarity with the existing code.

Releasing the locks early seems fine to me, although I'm not sure how
much it helps concurrency, since this path acquires only lower-level
locks such as AccessShareLock and RowExclusiveLock (SetAttrMissing(),
on the other hand, acquires AccessExclusiveLock on the table).

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#169Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#168)
Re: pg_upgrade and logical replication

On Thu, Dec 7, 2023 at 7:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Dec 5, 2023 at 6:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Dec 5, 2023 at 10:56 AM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:

I have made minor changes in the comments and code at various places.
See and let me know if you are not happy with the changes. I think
unless there are more suggestions or comments, we can proceed with
committing it.

Yeah. I am planning to look more closely at what you have here, and
it is going to take me a bit more time though (some more stuff planned
for next CF, an upcoming conference and end/beginning-of-year
vacations), but I think that targeting the beginning of next CF in
January would be OK.

Overall, I have the impression that the patch looks pretty solid, with
a restriction in place for "init" and "ready" relations, while there
are tests to check all the states that we expect. Seeing coverage
about all that makes me a happy hacker.

+ * If retain_lock is true, then don't release the locks taken in this function.
+ * We normally release the locks at the end of transaction but in binary-upgrade
+ * mode, we expect to release those immediately.

I think that this should be documented in pg_upgrade_support.c where
the caller expects the locks to be released, and why these should be
released. There is a risk that this comment becomes obsolete if
AddSubscriptionRelState() with locks released is called in a different
code path. Anyway, I am not sure I get why this is OK, or even
necessary. It seems like a good practice to keep the locks on the
subscription until the transaction that updates its state. If there's
a specific reason explaining why that's better, the patch should tell
why.

It is to be consistent with other code paths in the upgrade. We
followed existing coding rules like what we do in
binary_upgrade_set_missing_value->SetAttrMissing(). The probable
theory is that during the upgrade we are not worried about concurrent
operations being blocked till the transaction ends. As in this
particular case, we know that the apply worker won't try to sync any
of those relations or a concurrent DDL won't try to remove it from the
pg_subscrition_rel. This point is not being explicitly commented
because of its similarity with the existing code.

Releasing the locks early seems fine to me, although I'm not sure how
much it helps concurrency, since this path acquires only lower-level
locks such as AccessShareLock and RowExclusiveLock (SetAttrMissing(),
on the other hand, acquires AccessExclusiveLock on the table).

True, but we have kept it that way from the consistency point of view
as well. We can change it if you think otherwise.

--
With Regards,
Amit Kapila.

#170Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Amit Kapila (#169)
RE: pg_upgrade and logical replication

On Thursday, December 7, 2023 10:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Dec 7, 2023 at 7:26 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Tue, Dec 5, 2023 at 6:37 PM Amit Kapila <amit.kapila16@gmail.com>

wrote:

On Tue, Dec 5, 2023 at 10:56 AM Michael Paquier <michael@paquier.xyz>

wrote:

On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:

I have made minor changes in the comments and code at various

places.

See and let me know if you are not happy with the changes. I
think unless there are more suggestions or comments, we can
proceed with committing it.

Yeah. I am planning to look more closely at what you have here,
and it is going to take me a bit more time though (some more stuff
planned for next CF, an upcoming conference and
end/beginning-of-year vacations), but I think that targeting the
beginning of next CF in January would be OK.

Overall, I have the impression that the patch looks pretty solid,
with a restriction in place for "init" and "ready" relations,
while there are tests to check all the states that we expect.
Seeing coverage about all that makes me a happy hacker.

+ * If retain_lock is true, then don't release the locks taken in this function.
+ * We normally release the locks at the end of transaction but in
+ binary-upgrade
+ * mode, we expect to release those immediately.

I think that this should be documented in pg_upgrade_support.c
where the caller expects the locks to be released, and why these
should be released. There is a risk that this comment becomes
obsolete if
AddSubscriptionRelState() with locks released is called in a
different code path. Anyway, I am not sure I get why this is OK,
or even necessary. It seems like a good practice to keep the
locks on the subscription until the transaction that updates its
state. If there's a specific reason explaining why that's better,
the patch should tell why.

It is to be consistent with other code paths in the upgrade. We
followed existing coding rules like what we do in
binary_upgrade_set_missing_value->SetAttrMissing(). The probable
theory is that during the upgrade we are not worried about
concurrent operations being blocked till the transaction ends. As in
this particular case, we know that the apply worker won't try to
sync any of those relations or a concurrent DDL won't try to remove
it from the pg_subscription_rel. This point is not being explicitly
commented because of its similarity with the existing code.

Releasing the locks early seems fine to me, although I'm not sure how
much it helps concurrency, since this path acquires only lower-level
locks such as AccessShareLock and RowExclusiveLock (SetAttrMissing(),
on the other hand, acquires AccessExclusiveLock on the table).

True, but we have kept it that way from the consistency point of view as well.
We can change it if you think otherwise.

I also looked into the patch and didn't find any problems with the locking in
AddSubscriptionRelState.

As for concurrency, the locks on the subscription object and
pg_subscription_rel only conflict with ALTER/DROP SUBSCRIPTION, which takes
an AccessExclusiveLock, but since there are no concurrent ALTER
SUBSCRIPTION commands during an upgrade, I think it's OK to release them earlier.

I also thought about cache invalidation, since we modified the catalog,
which will generate catcache invalidations. But the apply worker, which
builds its cache based on pg_subscription_rel, is not running, and no
concurrent ALTER/DROP SUBSCRIPTION commands will be executed, so it looks OK as well.

Best Regards,
Hou zj

#171vignesh C
vignesh21@gmail.com
In reply to: Michael Paquier (#165)
1 attachment(s)
Re: pg_upgrade and logical replication

On Tue, 5 Dec 2023 at 10:56, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:

I have made minor changes in the comments and code at various places.
See and let me know if you are not happy with the changes. I think
unless there are more suggestions or comments, we can proceed with
committing it.

Yeah. I am planning to look more closely at what you have here, and
it is going to take me a bit more time though (some more stuff planned
for next CF, an upcoming conference and end/beginning-of-year
vacations), but I think that targetting the beginning of next CF in
January would be OK.

Overall, I have the impression that the patch looks pretty solid, with
a restriction in place for "init" and "ready" relations, while there
are tests to check all the states that we expect. Seeing coverage
about all that makes me a happy hacker.

+ * If retain_lock is true, then don't release the locks taken in this function.
+ * We normally release the locks at the end of transaction but in binary-upgrade
+ * mode, we expect to release those immediately.

I think that this should be documented in pg_upgrade_support.c where
the caller expects the locks to be released, and why these should be
released. There is a risk that this comment becomes obsolete if
AddSubscriptionRelState() with locks released is called in a different
code path. Anyway, I am not sure I get why this is OK, or even
necessary. It seems like a good practice to keep the locks on the
subscription until the transaction that updates its state. If there's
a specific reason explaining why that's better, the patch should tell
why.

Added comments for this.

+     * However, this shouldn't be a problem as the upgrade ensures
+     * that all the transactions were replicated before upgrading the
+     * publisher.

This wording looks a bit confusing to me, as "the upgrade" could refer
to the upgrade of a subscriber, but what we want to tell is that the
replay of the transactions is enforced when doing a publisher upgrade.
I'd suggest something like "the upgrade of the publisher ensures that
all the transactions were replicated before upgrading it".

Modified

+my $result = $old_sub->safe_psql('postgres',
+   "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
+is($result, qq(t), "Check that the table is in init state");

Hmm. Not sure that this is safe. Shouldn't this be a
poll_query_until(), polling that the state of the relation is what we
want it to be after requesting a fresh of the publication on the
subscriber?

This is not required, as the table will already be in the init state
after the "Alter Subscription ... Refresh ..." command itself.

Thanks for the comments, the attached v24 version patch has the
changes for the same.

Regards,
Vignesh

Attachments:

v24-0001-Allow-upgrades-to-preserve-the-full-subscription.patchtext/x-patch; charset=US-ASCII; name=v24-0001-Allow-upgrades-to-preserve-the-full-subscription.patchDownload
From dcec22de8d37a0d06e268b3747cd58a7b5692381 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Thu, 7 Dec 2023 09:50:27 +0530
Subject: [PATCH v24] Allow upgrades to preserve the full subscription's state.

This feature will allow us to replicate the changes on subscriber nodes
after the upgrade.

Previously, only the subscription metadata information was preserved.
Without the list of relations and their state, it's not possible to
re-enable the subscriptions without missing some records as the list of
relations can only be refreshed after enabling the subscription (and
therefore starting the apply worker).  Even if we added a way to refresh
the subscription while enabling a publication, we still wouldn't know
which relations are new on the publication side, and therefore should be
fully synced, and which shouldn't.

To preserve the subscription relations, this patch teaches pg_dump to
restore the content of pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To preserve the replication origins, this patch teaches pg_dump to update
the replication origin along with creating a subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

pg_upgrade will check that all the subscription relations are in 'i'
(init) or in 'r' (ready) state and will error out if that's not the case,
logging the reason for the failure. This helps to avoid the risk of any
dangling slot or origin after the upgrade.

Author: Vignesh C, Julien Rouhaud, Shlok Kyal
Reviewed-by: Peter Smith, Amit Kapila, Michael Paquier, Hayato Kuroda
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  50 ++++
 src/backend/catalog/pg_subscription.c      |  16 +-
 src/backend/commands/subscriptioncmds.c    |   4 +-
 src/backend/utils/adt/pg_upgrade_support.c | 134 +++++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 229 ++++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  24 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 193 ++++++++++++-
 src/bin/pg_upgrade/info.c                  |  56 +++-
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   2 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 319 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/include/catalog/pg_subscription_rel.h  |   2 +-
 src/tools/pgindent/typedefs.list           |   1 +
 16 files changed, 1059 insertions(+), 15 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 4f78e0e1c0..ee6f6288e5 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,56 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configuration</link> on the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies, which include the subscription's table information present
+     in the <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog as well as the subscription's replication origin. This
+     allows logical replication on the new subscriber to continue from where
+     the old subscriber left off. Migration of subscription dependencies is
+     only supported when the old cluster is version 17.0 or later; on clusters
+     before version 17.0, subscription dependencies are silently ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met, an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize) or <literal>r</literal> (ready). This
+       can be verified by checking <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
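The prerequisites listed in the documentation hunk above can also be checked by hand on the old subscriber before running <application>pg_upgrade</application>. The queries below are an illustrative sketch (not part of the patch) against the catalogs the checks inspect:

```sql
-- All subscribed tables must be in 'i' (initialize) or 'r' (ready) state;
-- any row returned here would make the pg_upgrade check fail.
SELECT s.subname, c.relname, r.srsubstate
FROM pg_subscription_rel r
JOIN pg_subscription s ON s.oid = r.srsubid
JOIN pg_class c ON c.oid = r.srrelid
WHERE r.srsubstate NOT IN ('i', 'r');

-- Each subscription must have a matching replication origin.
SELECT s.subname
FROM pg_subscription s
LEFT JOIN pg_replication_origin o ON o.roname = 'pg_' || s.oid
WHERE o.roname IS NULL;

-- Compare this count against max_replication_slots on the new cluster.
SELECT count(*) AS subscriptions FROM pg_subscription;
```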
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d6a978f136..7167377d82 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -228,10 +228,14 @@ textarray_to_stringlist(ArrayType *textarray)
 
 /*
  * Add new state record for a subscription table.
+ *
+ * If retain_lock is true, then don't release the locks taken in this function.
+ * We normally release the locks at the end of the transaction, but in
+ * binary-upgrade mode we expect to release them immediately.
  */
 void
 AddSubscriptionRelState(Oid subid, Oid relid, char state,
-						XLogRecPtr sublsn)
+						XLogRecPtr sublsn, bool retain_lock)
 {
 	Relation	rel;
 	HeapTuple	tup;
@@ -269,7 +273,15 @@ AddSubscriptionRelState(Oid subid, Oid relid, char state,
 	heap_freetuple(tup);
 
 	/* Cleanup. */
-	table_close(rel, NoLock);
+	if (retain_lock)
+	{
+		table_close(rel, NoLock);
+	}
+	else
+	{
+		table_close(rel, RowExclusiveLock);
+		UnlockSharedObject(SubscriptionRelationId, subid, 0, AccessShareLock);
+	}
 }
 
 /*
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index edc82c11be..dd067d39ad 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -773,7 +773,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 										 rv->schemaname, rv->relname);
 
 				AddSubscriptionRelState(subid, relid, table_state,
-										InvalidXLogRecPtr);
+										InvalidXLogRecPtr, true);
 			}
 
 			/*
@@ -943,7 +943,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 			{
 				AddSubscriptionRelState(sub->oid, relid,
 										copy_data ? SUBREL_STATE_INIT : SUBREL_STATE_READY,
-										InvalidXLogRecPtr);
+										InvalidXLogRecPtr, true);
 				ereport(DEBUG1,
 						(errmsg_internal("table \"%s.%s\" added to subscription \"%s\"",
 										 rv->schemaname, rv->relname, sub->name)));
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 92921b0239..e1ed6cd7e9 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,23 @@
 
 #include "postgres.h"
 
+#include "access/relation.h"
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +313,129 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	subrel;
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation %u does not exist", relid));
+
+	subrel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+						  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	rel = relation_open(relid, AccessShareLock);
+
+	/*
+	 * Since there are no concurrent ALTER/DROP SUBSCRIPTION commands during
+	 * the upgrade process, and the apply worker (which builds cache based on
+	 * the subscription catalog) is not running, the locks can be released
+	 * immediately.
+	 */
+	AddSubscriptionRelState(subid, relid, relstate, sublsn, false);
+	relation_close(rel, AccessShareLock);
+
+	ReleaseSysCache(tup);
+	table_close(subrel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	HeapTuple	tup;
+	Oid			subid;
+	Form_pg_subscription form;
+	char	   *subname;
+	XLogRecPtr	remote_commit;
+	char		originname[NAMEDATALEN];
+	RepOriginId node;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	remote_commit = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	/* Fetch the existing tuple. */
+	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("subscription \"%s\" does not exist", subname));
+
+	form = (Form_pg_subscription) GETSTRUCT(tup);
+	subid = form->oid;
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* Lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	node = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster, and the shutdown checkpoint will flush the advanced origin
+	 * data to disk at that point.
+	 */
+	replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	heap_freetuple(tup);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
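Putting the two new support functions together, a binary-upgrade dump script would invoke them alongside the existing subscription DDL. The fragment below is a hypothetical sketch of such generated output; the subscription name, relation OID, and LSNs are made up for illustration:

```sql
-- Hypothetical fragment of a pg_dump --binary-upgrade script.
-- Preserve the pg_subscription_rel entry for one subscribed table:
SELECT pg_catalog.binary_upgrade_add_sub_rel_state('mysub', 16384, 'r', '0/1A2B3C4');
-- Preserve the subscriber's replication origin position:
SELECT pg_catalog.binary_upgrade_replorigin_advance('mysub', '0/1A2B3C4');
-- Restore the subscription's running state:
ALTER SUBSCRIPTION mysub ENABLE;
```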
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 8c0b5486b9..452bd1545e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -297,6 +297,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4618,6 +4619,8 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
+	int			i_subenabled;
 	int			i,
 				ntups;
 
@@ -4673,16 +4676,30 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n"
+							 " s.subenabled\n");
+	else
+		appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n"
+							 " false AS subenabled\n");
+
+	appendPQExpBufferStr(query,
+						 "FROM pg_subscription s\n");
+
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query,
+							 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+							 "    ON o.external_id = 'pg_' || s.oid::text \n");
+
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4709,6 +4726,8 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
+	i_subenabled = PQfnumber(res, "subenabled");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4746,6 +4765,13 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+		subinfo[i].subenabled =
+			pg_strdup(PQgetvalue(res, i, i_subenabled));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4755,6 +4781,162 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	res = ExecuteSqlQuery(fout,
+						  "SELECT srsubid, srrelid, srsubstate, srsublsn "
+						  "FROM pg_catalog.pg_subscription_rel "
+						  "ORDER BY srsubid",
+						  PGRES_TUPLES_OK);
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+}
+
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to pg_subscription_rel table. This will be used only in
+		 * binary-upgrade mode.
+		 */
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+		appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+		appendPQExpBuffer(query,
+						  ", %u, '%c'",
+						  subrinfo->tblinfo->dobj.catId.oid,
+						  subrinfo->srsubstate);
+
+		if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+			appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn);
+		else
+			appendPQExpBufferStr(query, ", NULL");
+
+		appendPQExpBufferStr(query, ");\n");
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4835,6 +5017,43 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	/*
+	 * In binary-upgrade mode, we allow the replication to continue after the
+	 * upgrade.
+	 */
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+	{
+		if (subinfo->suboriginremotelsn)
+		{
+			/*
+			 * Preserve the remote_lsn for the subscriber's replication
+			 * origin. This value is required to start the replication from
+			 * the position before the upgrade. This value will be stale if
+			 * the publisher gets upgraded before the subscriber node.
+			 * However, this shouldn't be a problem as the upgrade of the
+			 * publisher ensures that all the transactions were replicated
+			 * before upgrading it.
+			 */
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+			appendStringLiteralAH(query, subinfo->dobj.name, fout);
+			appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+		}
+
+		if (strcmp(subinfo->subenabled, "t") == 0)
+		{
+			/*
+			 * Enable the subscription to allow the replication to continue
+			 * after the upgrade.
+			 */
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber's running state.\n");
+			appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s ENABLE;\n", qsubname);
+		}
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10453,6 +10672,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18519,6 +18741,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..20723d3a60 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,		/* see note for SubRelInfo */
 } DumpableObjectType;
 
 /*
@@ -660,6 +661,7 @@ typedef struct _SubscriptionInfo
 {
 	DumpableObject dobj;
 	const char *rolname;
+	char	   *subenabled;
 	char	   *subbinary;
 	char	   *substream;
 	char	   *subtwophasestate;
@@ -671,8 +673,28 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ *
+ * XXX Currently the subscription tables are added to the subscription after
+ * the subscription has been enabled in binary-upgrade mode. Since the apply
+ * workers are not started in binary-upgrade mode, this ordering does not
+ * matter. The ordering of adding the subscription tables and enabling the
+ * subscription must be taken care of if this feature is ever supported in
+ * non-binary-upgrade mode.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +719,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +779,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..5a1ebac4b1 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -34,7 +34,9 @@ static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 static void check_new_cluster_logical_replication_slots(void);
+static void check_new_cluster_subscription_configuration(void);
 static void check_old_cluster_for_valid_slots(bool live_check);
+static void check_old_cluster_subscription_state(void);
 
 
 /*
@@ -112,13 +114,21 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/*
-	 * Logical replication slots can be migrated since PG17. See comments atop
-	 * get_old_cluster_logical_slot_infos().
-	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+	{
+		/*
+		 * Logical replication slots can be migrated since PG17. See comments
+		 * atop get_old_cluster_logical_slot_infos().
+		 */
 		check_old_cluster_for_valid_slots(live_check);
 
+		/*
+		 * Subscriptions and their dependencies can be migrated since PG17.
+		 * See comments atop get_db_subscription_count().
+		 */
+		check_old_cluster_subscription_state();
+	}
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -237,6 +247,8 @@ check_new_cluster(void)
 	check_for_new_tablespace_dir();
 
 	check_new_cluster_logical_replication_slots();
+
+	check_new_cluster_subscription_configuration();
 }
 
 
@@ -1538,6 +1550,53 @@ check_new_cluster_logical_replication_slots(void)
 	check_ok();
 }
 
+/*
+ * check_new_cluster_subscription_configuration()
+ *
+ * Verify that the max_replication_slots configuration specified is enough for
+ * creating the subscriptions. This is required to create the replication
+ * origin for each subscription.
+ */
+static void
+check_new_cluster_subscription_configuration(void)
+{
+	PGresult   *res;
+	PGconn	   *conn;
+	int			nsubs_on_old;
+	int			max_replication_slots;
+
+	/* Subscriptions and their dependencies can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	nsubs_on_old = count_old_cluster_subscriptions();
+
+	/* Quick return if there are no subscriptions to be migrated. */
+	if (nsubs_on_old == 0)
+		return;
+
+	prep_status("Checking for new cluster configuration for subscriptions");
+
+	conn = connectToServer(&new_cluster, "template1");
+
+	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+							"WHERE name = 'max_replication_slots';");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine parameter settings on new cluster");
+
+	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
+	if (nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
+	PQclear(res);
+	PQfinish(conn);
+
+	check_ok();
+}
+
 /*
  * check_old_cluster_for_valid_slots()
  *
@@ -1613,3 +1672,129 @@ check_old_cluster_for_valid_slots(bool live_check)
 
 	check_ok();
 }
+
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that the replication origin corresponding to each of the
+ * subscriptions are present and each of the subscribed tables is in
+ * 'i' (initialize) or 'r' (ready) state.
+ */
+static void
+check_old_cluster_subscription_state(void)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subs_invalid.txt");
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &old_cluster.dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(&old_cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "The replication origin is missing for database:\"%s\" subscription:\"%s\"\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * We don't allow upgrade if there is a risk of dangling slot or
+		 * origin corresponding to initial sync after upgrade.
+		 *
+		 * A slot/origin not created yet refers to the 'i' (initialize) state,
+		 * while 'r' (ready) state refers to a slot/origin created previously
+		 * but already dropped. These states are supported for pg_upgrade. The
+		 * other states listed below are not supported:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * used for the slot name won't match anymore.
+		 *
+		 * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+		 * would retain the replication origin when there is a failure in
+		 * tablesync worker immediately after dropping the replication slot in
+		 * the publisher.
+		 *
+		 * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT r.srsubstate, s.subname, n.nspname, c.relname "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "The table sync state \"%s\" is not allowed for database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\"\n",
+					PQgetvalue(res, i, 0),
+					active_db->db_name,
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in the 'i' (initialize) or 'r' (ready) state.\n"
+				 "You can allow the initial sync to finish for all relations and then restart the upgrade.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
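When the state check above fails, the report file points at the offending relations. Assuming the in-flight table syncs can simply be allowed to finish, one way to proceed (an illustrative query, not part of the patch) is to poll the old subscriber until nothing remains in a transient sync state and then retry the upgrade:

```sql
-- Illustrative: upgrade can be retried once this returns 0.
SELECT count(*) AS not_ready
FROM pg_subscription_rel
WHERE srsubstate NOT IN ('i', 'r');
```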
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..cc73c0fc0c 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,6 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
+static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,10 +294,14 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		get_rel_infos(cluster, pDbInfo);
 
 		/*
-		 * Retrieve the logical replication slots infos for the old cluster.
+		 * Retrieve the logical replication slots infos and the subscriptions
+		 * count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
+		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
+			get_db_subscription_count(pDbInfo);
+		}
 	}
 
 	if (cluster == &old_cluster)
@@ -730,6 +735,55 @@ count_old_cluster_logical_slots(void)
 	return slot_count;
 }
 
+/*
+ * get_db_subscription_count()
+ *
+ * Gets the number of subscriptions of the database referred to by "dbinfo".
+ *
+ * Note: This function will not do anything if the old cluster is pre-PG17.
+ * This is because before that the logical slots are not upgraded, so we will
+ * not be able to upgrade the logical replication clusters completely.
+ */
+static void
+get_db_subscription_count(DbInfo *dbinfo)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	res = executeQueryOrDie(conn, "SELECT count(*) "
+							"FROM pg_catalog.pg_subscription WHERE subdbid = %d",
+							dbinfo->db_oid);
+	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscriptions for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 and prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_db_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+	int			nsubs = 0;
+
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+	return nsubs;
+}
+
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..d63f13fffc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -195,6 +195,7 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
+	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -421,6 +422,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
+int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..b4ddc20c52
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,319 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+my $oldbindir = $old_sub->config_data('--bindir');
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $newbindir = $new_sub->config_data('--bindir');
+
+# In a VPATH build, we'll be started in the source directory, but we want
+# to run pg_upgrade in the build directory so that any files generated finish
+# in it, like delete_old_cluster.{sh,bat}.
+chdir ${PostgreSQL::Test::Utils::tmp_check};
+
+# Initial setup
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE TABLE tab_upgraded2(id int);
+]);
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE TABLE tab_upgraded2(id int);
+]);
+
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+# Setup an enabled subscription to verify that the running status is retained
+# after upgrade.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+# Verify that the upgrade should be successful with tables in 'ready'/'init'
+# state along with retaining the replication origin remote lsn, and
+# subscription running status.
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub2 FOR TABLE tab_upgraded1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2"
+);
+# Wait till the table tab_upgraded1 reaches 'ready' state
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";
+
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))");
+$publisher->wait_for_catchup('regress_sub2');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub->restart;
+
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+
+# The table tab_upgraded2 will be in init state as the subscriber
+# configuration for max_logical_replication_workers is set to 0.
+my $result = $old_sub->safe_psql('postgres',
+	"SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
+is($result, qq(t), "Check that the table is in init state");
+
+# Get the replication origin remote_lsn of the old subscriber
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub2'"
+);
+# Have the subscription in disabled state before upgrade
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+
+my $tab_upgraded1_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_class WHERE relname = 'tab_upgraded1'");
+my $tab_upgraded2_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_class WHERE relname = 'tab_upgraded2'");
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready or in
+# init state (tab_upgraded1 table is in ready state and tab_upgraded2 table is
+# in init state) along with retaining the replication origin remote lsn
+# and subscription running status.
+# ------------------------------------------------------
+command_ok(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode
+	],
+	'run of pg_upgrade for old instance when the subscription tables are in init/ready state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# ------------------------------------------------------
+# Check that the data inserted to the publisher when the new subscriber is down
+# will be replicated once it is started. Also check that the old subscription
+# states and relations origins are all preserved.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		INSERT INTO tab_upgraded1 VALUES(51);
+		INSERT INTO tab_upgraded2 VALUES(1);
+]);
+
+$new_sub->start;
+
+# The subscription's running status should be preserved. Old subscription
+# regress_sub1 should be enabled and old subscription regress_sub2 should be
+# disabled.
+$result =
+  $new_sub->safe_psql('postgres',
+	"SELECT subname, subenabled FROM pg_subscription ORDER BY subname");
+is( $result, qq(regress_sub1|t
+regress_sub2|f),
+	"check that the subscription's running status are preserved");
+
+my $sub_oid = $new_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+
+# Subscription relations should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT srrelid, srsubstate FROM pg_subscription_rel WHERE srsubid = $sub_oid ORDER BY srrelid"
+);
+is( $result, qq($tab_upgraded1_oid|r
+$tab_upgraded2_oid|i),
+	"there should be 2 rows in pg_subscription_rel(representing tab_upgraded1 and tab_upgraded2)"
+);
+
+# The replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status WHERE external_id = 'pg_' || $sub_oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
+
+# Wait until all tables of subscription 'regress_sub2' are synchronized
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub2');
+
+# Rows on tab_upgraded1 and tab_upgraded2 should have been replicated
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(1),
+	"check the data is synced after enabling the subscription for the table that was in init state"
+);
+
+# cleanup
+$new_sub->stop;
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 4");
+$old_sub->start;
+$old_sub->safe_psql(
+	'postgres', qq[
+		ALTER SUBSCRIPTION regress_sub1 DISABLE;
+		ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none);
+		DROP SUBSCRIPTION regress_sub1;
+]);
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+# Drop the subscription
+$old_sub->start;
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init) and/or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE PUBLICATION regress_pub3 FOR TABLE tab_primary_key;
+]);
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub3;
+]);
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (enabled = false)"
+);
+$sub_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
+my $reporigin = 'pg_' . qq($sub_oid);
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub1->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub3\" schema:\"public\" relation:\"tab_primary_key\"/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub4 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub4\"/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 77e8b13764..3a1ba14018 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11395,6 +11395,16 @@
   proname => 'binary_upgrade_logical_slot_has_caught_up', provolatile => 'v',
   proparallel => 'u', prorettype => 'bool', proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h
index f5324b710d..34ec3117a3 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -81,7 +81,7 @@ typedef struct SubscriptionRelState
 } SubscriptionRelState;
 
 extern void AddSubscriptionRelState(Oid subid, Oid relid, char state,
-									XLogRecPtr sublsn);
+									XLogRecPtr sublsn, bool retain_lock);
 extern void UpdateSubscriptionRelState(Oid subid, Oid relid, char state,
 									   XLogRecPtr sublsn);
 extern char GetSubscriptionRelState(Oid subid, Oid relid, XLogRecPtr *sublsn);
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 38a86575e1..252e01d8f2 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2665,6 +2665,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#172vignesh C
vignesh21@gmail.com
In reply to: Masahiko Sawada (#167)
Re: pg_upgrade and logical replication

On Thu, 7 Dec 2023 at 07:20, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Dec 4, 2023 at 8:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Dec 1, 2023 at 11:24 PM vignesh C <vignesh21@gmail.com> wrote:

The attached v22 version patch has the changes for the same.

I have made minor changes in the comments and code at various places.
See and let me know if you are not happy with the changes. I think
unless there are more suggestions or comments, we can proceed with
committing it.

It seems the patch is already close to ready-to-commit state but I've
had a look at the v23 patch with fresh eyes. It looks mostly good to
me and there are some minor comments:

---
+   tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+   if (!HeapTupleIsValid(tup))
+       ereport(ERROR,
+               errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+               errmsg("relation %u does not exist", relid));
+   ReleaseSysCache(tup);

Given what we want to do here is just an existence check, isn't it
clearer if we use SearchSysCacheExists1() instead?

Modified

---
+        query = createPQExpBuffer();
+        appendPQExpBuffer(query, "SELECT srsubid, srrelid,
srsubstate, srsublsn"
+                                          " FROM
pg_catalog.pg_subscription_rel"
+                                          " ORDER BY srsubid");
+        res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+

Probably we don't need to use PQExpBuffer here since the query to
execute is a static string.

Modified

---
+# The subscription's running status should be preserved. Old subscription
+# regress_sub1 should be enabled and old subscription regress_sub2 should be
+# disabled.
+$result =
+  $new_sub->safe_psql('postgres',
+        "SELECT subenabled FROM pg_subscription ORDER BY subname");
+is( $result, qq(t
+f),
+        "check that the subscription's running status are preserved");
+

How about showing the subname along with the subenabled so that we can
check if each subscription is in an expected state in case where
something error happens?

Modified

---
+# Subscription relations should be preserved
+$result =
+  $new_sub->safe_psql('postgres',
+        "SELECT count(*) FROM pg_subscription_rel WHERE srsubid = $sub_oid");
+is($result, qq(2),
+        "there should be 2 rows in pg_subscription_rel(representing
tab_upgraded1 and tab_upgraded2)"
+);

Is there any reason why we check only the number of rows in
pg_subscription_rel? I guess it might be a good idea to check if table
OIDs there are also preserved.

Modified

---
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
+$publisher->wait_for_catchup('regress_sub2');
+

IIUC after making the subscription regress_sub2 enabled, we will start
the initial table sync for the table tab_upgraded2. If so, shouldn't
we use wait_for_subscription_sync() instead?

Modified

---
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+        "CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr'
PUBLICATION regress_pub3 WITH (enabled=false)"

It's better to put spaces before and after '='.

Modified

---
+my $subid = $old_sub->safe_psql('postgres',
+        "SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");

I think we can reuse $sub_oid.

Modified

Thanks for the comments, the v24 version patch attached at [1] has the
changes for the same.

[1]: /messages/by-id/CALDaNm27+B6hiCS3g3nUDpfwmTaj6YopSY5ovo2=__iOSpkPbA@mail.gmail.com

Regards,
Vignesh

#173Masahiko Sawada
sawada.mshk@gmail.com
In reply to: vignesh C (#171)
Re: pg_upgrade and logical replication

On Thu, Dec 7, 2023 at 8:15 PM vignesh C <vignesh21@gmail.com> wrote:

On Tue, 5 Dec 2023 at 10:56, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:

I have made minor changes in the comments and code at various places.
See and let me know if you are not happy with the changes. I think
unless there are more suggestions or comments, we can proceed with
committing it.

Yeah. I am planning to look more closely at what you have here, and
it is going to take me a bit more time though (some more stuff planned
for next CF, an upcoming conference and end/beginning-of-year
vacations), but I think that targetting the beginning of next CF in
January would be OK.

Overall, I have the impression that the patch looks pretty solid, with
a restriction in place for "init" and "ready" relations, while there
are tests to check all the states that we expect. Seeing coverage
about all that makes me a happy hacker.

+ * If retain_lock is true, then don't release the locks taken in this function.
+ * We normally release the locks at the end of transaction but in binary-upgrade
+ * mode, we expect to release those immediately.

I think that this should be documented in pg_upgrade_support.c where
the caller expects the locks to be released, and why these should be
released. There is a risk that this comment becomes obsolete if
AddSubscriptionRelState() with locks released is called in a different
code path. Anyway, I am not sure to get why this is OK, or even
necessary. It seems like a good practice to keep the locks on the
subscription until the transaction that updates its state. If there's
a specific reason explaining why that's better, the patch should tell
why.

Added comments for this.

+     * However, this shouldn't be a problem as the upgrade ensures
+     * that all the transactions were replicated before upgrading the
+     * publisher.
This wording looks a bit confusing to me, as "the upgrade" could refer
to the upgrade of a subscriber, but what we want to tell is that the
replay of the transactions is enforced when doing a publisher upgrade.
I'd suggest something like "the upgrade of the publisher ensures that
all the transactions were replicated before upgrading it".

Modified

+my $result = $old_sub->safe_psql('postgres',
+   "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
+is($result, qq(t), "Check that the table is in init state");

Hmm. Not sure that this is safe. Shouldn't this be a
poll_query_until(), polling that the state of the relation is what we
want it to be after requesting a fresh of the publication on the
subscriber?

This is not required as the table will be added in init state after
"Alter Subscription ... Refresh .." command itself.

Thanks for the comments, the attached v24 version patch has the
changes for the same.

Thank you for updating the patch.

Here are some minor comments:

+        if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
+                ereport(ERROR,
+                                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                                errmsg("relation %u does not exist", relid));
+

I think the error code should be ERRCODE_UNDEFINED_TABLE, and the
error message should be something like "relation with OID %u does not
exist". Or we might not need such checks since an undefined-object
error is caught by relation_open()?

---
+        /* Fetch the existing tuple. */
+        tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+                                                  CStringGetDatum(subname));
+        if (!HeapTupleIsValid(tup))
+                ereport(ERROR,
+                                errcode(ERRCODE_UNDEFINED_OBJECT),
+                                errmsg("subscription \"%s\" does not
exist", subname));
+
+        form = (Form_pg_subscription) GETSTRUCT(tup);
+        subid = form->oid;

The above code can be replaced with "get_subscription_oid(subname,
false)". binary_upgrade_replorigin_advance() has the same code.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#174vignesh C
vignesh21@gmail.com
In reply to: Masahiko Sawada (#173)
1 attachment(s)
Re: pg_upgrade and logical replication

On Wed, 13 Dec 2023 at 01:56, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Dec 7, 2023 at 8:15 PM vignesh C <vignesh21@gmail.com> wrote:

On Tue, 5 Dec 2023 at 10:56, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:

I have made minor changes in the comments and code at various places.
See and let me know if you are not happy with the changes. I think
unless there are more suggestions or comments, we can proceed with
committing it.

Yeah. I am planning to look more closely at what you have here, and
it is going to take me a bit more time though (some more stuff planned
for next CF, an upcoming conference and end/beginning-of-year
vacations), but I think that targetting the beginning of next CF in
January would be OK.

Overall, I have the impression that the patch looks pretty solid, with
a restriction in place for "init" and "ready" relations, while there
are tests to check all the states that we expect. Seeing coverage
about all that makes me a happy hacker.

+ * If retain_lock is true, then don't release the locks taken in this function.
+ * We normally release the locks at the end of transaction but in binary-upgrade
+ * mode, we expect to release those immediately.

I think that this should be documented in pg_upgrade_support.c where
the caller expects the locks to be released, and why these should be
released. There is a risk that this comment becomes obsolete if
AddSubscriptionRelState() with locks released is called in a different
code path. Anyway, I am not sure to get why this is OK, or even
necessary. It seems like a good practice to keep the locks on the
subscription until the transaction that updates its state. If there's
a specific reason explaining why that's better, the patch should tell
why.

Added comments for this.

+     * However, this shouldn't be a problem as the upgrade ensures
+     * that all the transactions were replicated before upgrading the
+     * publisher.
This wording looks a bit confusing to me, as "the upgrade" could refer
to the upgrade of a subscriber, but what we want to tell is that the
replay of the transactions is enforced when doing a publisher upgrade.
I'd suggest something like "the upgrade of the publisher ensures that
all the transactions were replicated before upgrading it".

Modified

+my $result = $old_sub->safe_psql('postgres',
+   "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
+is($result, qq(t), "Check that the table is in init state");

Hmm. Not sure that this is safe. Shouldn't this be a
poll_query_until(), polling that the state of the relation is what we
want it to be after requesting a fresh of the publication on the
subscriber?

This is not required as the table will be added in init state after
"Alter Subscription ... Refresh .." command itself.

Thanks for the comments, the attached v24 version patch has the
changes for the same.

Thank you for updating the patch.

Here are some minor comments:

+        if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
+                ereport(ERROR,
+                                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                                errmsg("relation %u does not exist", relid));
+

I think the error code should be ERRCODE_UNDEFINED_TABLE, and the
error message should be something like "relation with OID %u does not
exist". Or we might not need such checks since an undefined-object
error is caught by relation_open()?

I have removed this as it will be caught by relation_open.

---
+        /* Fetch the existing tuple. */
+        tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+                                                  CStringGetDatum(subname));
+        if (!HeapTupleIsValid(tup))
+                ereport(ERROR,
+                                errcode(ERRCODE_UNDEFINED_OBJECT),
+                                errmsg("subscription \"%s\" does not
exist", subname));
+
+        form = (Form_pg_subscription) GETSTRUCT(tup);
+        subid = form->oid;

The above code can be replaced with "get_subscription_oid(subname,
false)". binary_upgrade_replorigin_advance() has the same code.

Modified

Thanks for the comments, the attached v25 version patch has the
changes for the same.

Regards,
Vignesh

Attachments:

v25-0001-Allow-upgrades-to-preserve-the-full-subscription.patch (text/x-patch; charset=US-ASCII)
From a227419d761e9e121eeaf2db621d227c6abc6dc0 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Thu, 7 Dec 2023 09:50:27 +0530
Subject: [PATCH v25] Allow upgrades to preserve the full subscription's state.

This feature will allow us to replicate the changes on subscriber nodes
after the upgrade.

Previously, only the subscription metadata information was preserved.
Without the list of relations and their state, it's not possible to
re-enable the subscriptions without missing some records as the list of
relations can only be refreshed after enabling the subscription (and
therefore starting the apply worker).  Even if we added a way to refresh
the subscription while enabling a publication, we still wouldn't know
which relations are new on the publication side, and therefore should be
fully synced, and which shouldn't.

To preserve the subscription relations, this patch teaches pg_dump to
restore the content of pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To preserve the replication origins, this patch teaches pg_dump to update
the replication origin along with creating a subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

pg_upgrade will check that all the subscription relations are in 'i'
(init) or in 'r' (ready) state and will error out if that's not the case,
logging the reason for the failure. This helps to avoid the risk of any
dangling slot or origin after the upgrade.

Author: Vignesh C, Julien Rouhaud, Shlok Kyal
Reviewed-by: Peter Smith, Amit Kapila, Michael Paquier, Hayato Kuroda
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  50 ++++
 src/backend/catalog/pg_subscription.c      |  16 +-
 src/backend/commands/subscriptioncmds.c    |   4 +-
 src/backend/utils/adt/pg_upgrade_support.c | 104 +++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 229 ++++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  24 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 193 ++++++++++++-
 src/bin/pg_upgrade/info.c                  |  56 +++-
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   2 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 319 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/include/catalog/pg_subscription_rel.h  |   2 +-
 src/tools/pgindent/typedefs.list           |   1 +
 16 files changed, 1029 insertions(+), 15 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 2520f6c50d..87be1fb1c2 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,56 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies, which include the subscription's table information present
+     in the <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog, as well as the subscription's replication origin. This
+     allows logical replication on the new subscriber to continue from the
+     point at which the old subscriber left off. Migration of subscription
+     dependencies is only supported when the old cluster is version 17.0 or
+     later. Subscription dependencies on older clusters are silently ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met, an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize) or <literal>r</literal> (ready). This
+       can be verified by checking <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d6a978f136..7167377d82 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -228,10 +228,14 @@ textarray_to_stringlist(ArrayType *textarray)
 
 /*
  * Add new state record for a subscription table.
+ *
+ * If retain_lock is true, then don't release the locks taken in this function.
+ * We normally release the locks at the end of the transaction, but in
+ * binary-upgrade mode we expect to release them immediately.
  */
 void
 AddSubscriptionRelState(Oid subid, Oid relid, char state,
-						XLogRecPtr sublsn)
+						XLogRecPtr sublsn, bool retain_lock)
 {
 	Relation	rel;
 	HeapTuple	tup;
@@ -269,7 +273,15 @@ AddSubscriptionRelState(Oid subid, Oid relid, char state,
 	heap_freetuple(tup);
 
 	/* Cleanup. */
-	table_close(rel, NoLock);
+	if (retain_lock)
+	{
+		table_close(rel, NoLock);
+	}
+	else
+	{
+		table_close(rel, RowExclusiveLock);
+		UnlockSharedObject(SubscriptionRelationId, subid, 0, AccessShareLock);
+	}
 }
 
 /*
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index edc82c11be..dd067d39ad 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -773,7 +773,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 										 rv->schemaname, rv->relname);
 
 				AddSubscriptionRelState(subid, relid, table_state,
-										InvalidXLogRecPtr);
+										InvalidXLogRecPtr, true);
 			}
 
 			/*
@@ -943,7 +943,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 			{
 				AddSubscriptionRelState(sub->oid, relid,
 										copy_data ? SUBREL_STATE_INIT : SUBREL_STATE_READY,
-										InvalidXLogRecPtr);
+										InvalidXLogRecPtr, true);
 				ereport(DEBUG1,
 						(errmsg_internal("table \"%s.%s\" added to subscription \"%s\"",
 										 rv->schemaname, rv->relname, sub->name)));
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 92921b0239..2afd01648d 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,24 @@
 
 #include "postgres.h"
 
+#include "access/relation.h"
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/lsyscache.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +314,98 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	subrel;
+	Relation	rel;
+	Oid			subid;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	subrel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+	subid = get_subscription_oid(subname, false);
+	rel = relation_open(relid, AccessShareLock);
+
+	/*
+	 * Since there are no concurrent ALTER/DROP SUBSCRIPTION commands during
+	 * the upgrade process, and the apply worker (which builds cache based on
+	 * the subscription catalog) is not running, the locks can be released
+	 * immediately.
+	 */
+	AddSubscriptionRelState(subid, relid, relstate, sublsn, false);
+	relation_close(rel, AccessShareLock);
+
+	table_close(subrel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	Oid			subid;
+	char	   *subname;
+	XLogRecPtr	remote_commit;
+	char		originname[NAMEDATALEN];
+	RepOriginId node;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	remote_commit = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+	subid = get_subscription_oid(subname, false);
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* Lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	node = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster. The shutdown checkpoint will then flush the replication
+	 * origins to disk.
+	 */
+	replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 8c0b5486b9..452bd1545e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -297,6 +297,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4618,6 +4619,8 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
+	int			i_subenabled;
 	int			i,
 				ntups;
 
@@ -4673,16 +4676,30 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n"
+							 " s.subenabled\n");
+	else
+		appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n"
+							 " false AS subenabled\n");
+
+	appendPQExpBufferStr(query,
+						 "FROM pg_subscription s\n");
+
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query,
+							 "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+							 "    ON o.external_id = 'pg_' || s.oid::text \n");
+
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4709,6 +4726,8 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
+	i_subenabled = PQfnumber(res, "subenabled");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4746,6 +4765,13 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+		subinfo[i].subenabled =
+			pg_strdup(PQgetvalue(res, i, i_subenabled));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4755,6 +4781,162 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	res = ExecuteSqlQuery(fout,
+						  "SELECT srsubid, srrelid, srsubstate, srsublsn "
+						  "FROM pg_catalog.pg_subscription_rel "
+						  "ORDER BY srsubid",
+						  PGRES_TUPLES_OK);
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+}
+
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to pg_subscription_rel table. This will be used only in
+		 * binary-upgrade mode.
+		 */
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+		appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+		appendPQExpBuffer(query,
+						  ", %u, '%c'",
+						  subrinfo->tblinfo->dobj.catId.oid,
+						  subrinfo->srsubstate);
+
+		if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+			appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn);
+		else
+			appendPQExpBuffer(query, ", NULL");
+
+		appendPQExpBufferStr(query, ");\n");
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4835,6 +5017,43 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	/*
+	 * In binary-upgrade mode, we allow the replication to continue after the
+	 * upgrade.
+	 */
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+	{
+		if (subinfo->suboriginremotelsn)
+		{
+			/*
+			 * Preserve the remote_lsn for the subscriber's replication
+			 * origin. This value is required to start the replication from
+			 * the position before the upgrade. This value will be stale if
+			 * the publisher gets upgraded before the subscriber node.
+			 * However, this shouldn't be a problem as the upgrade of the
+			 * publisher ensures that all the transactions were replicated
+			 * before upgrading it.
+			 */
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+			appendStringLiteralAH(query, subinfo->dobj.name, fout);
+			appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+		}
+
+		if (strcmp(subinfo->subenabled, "t") == 0)
+		{
+			/*
+			 * Enable the subscription to allow the replication to continue
+			 * after the upgrade.
+			 */
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber's running state.\n");
+			appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s ENABLE;\n", qsubname);
+		}
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10453,6 +10672,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18519,6 +18741,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..20723d3a60 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,		/* see note for SubRelInfo */
 } DumpableObjectType;
 
 /*
@@ -660,6 +661,7 @@ typedef struct _SubscriptionInfo
 {
 	DumpableObject dobj;
 	const char *rolname;
+	char	   *subenabled;
 	char	   *subbinary;
 	char	   *substream;
 	char	   *subtwophasestate;
@@ -671,8 +673,28 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ *
+ * XXX Currently, in binary-upgrade mode, the subscription tables are added to
+ * the subscription after the subscription has been enabled. Since the apply
+ * workers are not started in binary-upgrade mode, this ordering does not
+ * matter. However, if this feature is ever supported outside of
+ * binary-upgrade mode, care must be taken to order adding the subscription
+ * tables relative to enabling the subscription.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +719,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +779,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..5a1ebac4b1 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -34,7 +34,9 @@ static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 static void check_new_cluster_logical_replication_slots(void);
+static void check_new_cluster_subscription_configuration(void);
 static void check_old_cluster_for_valid_slots(bool live_check);
+static void check_old_cluster_subscription_state(void);
 
 
 /*
@@ -112,13 +114,21 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/*
-	 * Logical replication slots can be migrated since PG17. See comments atop
-	 * get_old_cluster_logical_slot_infos().
-	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+	{
+		/*
+		 * Logical replication slots can be migrated since PG17. See comments
+		 * atop get_old_cluster_logical_slot_infos().
+		 */
 		check_old_cluster_for_valid_slots(live_check);
 
+		/*
+		 * Subscriptions and their dependencies can be migrated since PG17.
+		 * See comments atop get_db_subscription_count().
+		 */
+		check_old_cluster_subscription_state();
+	}
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -237,6 +247,8 @@ check_new_cluster(void)
 	check_for_new_tablespace_dir();
 
 	check_new_cluster_logical_replication_slots();
+
+	check_new_cluster_subscription_configuration();
 }
 
 
@@ -1538,6 +1550,53 @@ check_new_cluster_logical_replication_slots(void)
 	check_ok();
 }
 
+/*
+ * check_new_cluster_subscription_configuration()
+ *
+ * Verify that the max_replication_slots configuration specified is enough for
+ * creating the subscriptions. This is required to create the replication
+ * origin for each subscription.
+ */
+static void
+check_new_cluster_subscription_configuration(void)
+{
+	PGresult   *res;
+	PGconn	   *conn;
+	int			nsubs_on_old;
+	int			max_replication_slots;
+
+	/* Subscriptions and their dependencies can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	nsubs_on_old = count_old_cluster_subscriptions();
+
+	/* Quick return if there are no subscriptions to be migrated. */
+	if (nsubs_on_old == 0)
+		return;
+
+	prep_status("Checking for new cluster configuration for subscriptions");
+
+	conn = connectToServer(&new_cluster, "template1");
+
+	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+							"WHERE name = 'max_replication_slots';");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine parameter settings on new cluster");
+
+	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
+	if (nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
+	PQclear(res);
+	PQfinish(conn);
+
+	check_ok();
+}
+
 /*
  * check_old_cluster_for_valid_slots()
  *
@@ -1613,3 +1672,129 @@ check_old_cluster_for_valid_slots(bool live_check)
 
 	check_ok();
 }
+
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that the replication origin corresponding to each of the
+ * subscriptions are present and each of the subscribed tables is in
+ * 'i' (initialize) or 'r' (ready) state.
+ */
+static void
+check_old_cluster_subscription_state(void)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subs_invalid.txt");
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &old_cluster.dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(&old_cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "The replication origin is missing for database:\"%s\" subscription:\"%s\"\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * We don't allow upgrade if there is a risk of dangling slot or
+		 * origin corresponding to initial sync after upgrade.
+		 *
+		 * A slot/origin not created yet refers to the 'i' (initialize) state,
+		 * while 'r' (ready) state refers to a slot/origin created previously
+		 * but already dropped. These states are supported for pg_upgrade. The
+		 * other states listed below are not supported:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * used for the slot name won't match anymore.
+		 *
+		 * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+		 * would retain the replication origin when there is a failure in
+		 * tablesync worker immediately after dropping the replication slot in
+		 * the publisher.
+		 *
+		 * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT r.srsubstate, s.subname, n.nspname, c.relname "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "The table sync state \"%s\" is not allowed for database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\"\n",
+					PQgetvalue(res, i, 0),
+					active_db->db_name,
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in i (initialize) or r (ready) state.\n"
+				 "You can allow the initial sync of all relations to finish and then restart the upgrade.\n"
+				 "A list of the problem subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..cc73c0fc0c 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,6 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
+static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,10 +294,14 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		get_rel_infos(cluster, pDbInfo);
 
 		/*
-		 * Retrieve the logical replication slots infos for the old cluster.
+		 * Retrieve the logical replication slot information and the
+		 * subscription count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
+		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
+			get_db_subscription_count(pDbInfo);
+		}
 	}
 
 	if (cluster == &old_cluster)
@@ -730,6 +735,55 @@ count_old_cluster_logical_slots(void)
 	return slot_count;
 }
 
+/*
+ * get_db_subscription_count()
+ *
+ * Gets the number of subscriptions of the database referred to by "dbinfo".
+ *
+ * Note: This function does nothing if the old cluster is pre-PG17, because
+ * logical slots are not upgraded from such clusters and the logical
+ * replication setup therefore cannot be migrated completely anyway.
+ */
+static void
+get_db_subscription_count(DbInfo *dbinfo)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	res = executeQueryOrDie(conn, "SELECT count(*) "
+							"FROM pg_catalog.pg_subscription WHERE subdbid = %u",
+							dbinfo->db_oid);
+	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscriptions for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 and prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_db_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+	int			nsubs = 0;
+
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+	return nsubs;
+}
+
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..d63f13fffc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -195,6 +195,7 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
+	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -421,6 +422,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
+int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..b4ddc20c52
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,319 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+my $oldbindir = $old_sub->config_data('--bindir');
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $newbindir = $new_sub->config_data('--bindir');
+
+# In a VPATH build, we'll be started in the source directory, but we want
+# to run pg_upgrade in the build directory so that any files generated finish
+# in it, like delete_old_cluster.{sh,bat}.
+chdir ${PostgreSQL::Test::Utils::tmp_check};
+
+# Initial setup
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE TABLE tab_upgraded2(id int);
+]);
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE TABLE tab_upgraded2(id int);
+]);
+
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+# Set up an enabled subscription to verify that the running status is retained
+# after the upgrade.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+# Set up a second subscription to verify that the upgrade succeeds with tables
+# in 'ready'/'init' state, and that the replication origin's remote_lsn and
+# the subscription's running status are retained.
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub2 FOR TABLE tab_upgraded1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2"
+);
+# Wait till the table tab_upgraded1 reaches 'ready' state
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";
+
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))");
+$publisher->wait_for_catchup('regress_sub2');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub->restart;
+
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+
+# The table tab_upgraded2 will be in init state as the subscriber
+# configuration for max_logical_replication_workers is set to 0.
+my $result = $old_sub->safe_psql('postgres',
+	"SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
+is($result, qq(t), "Check that the table is in init state");
+
+# Get the replication origin remote_lsn of the old subscriber
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub2'"
+);
+# Have the subscription in disabled state before upgrade
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+
+my $tab_upgraded1_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_class WHERE relname = 'tab_upgraded1'");
+my $tab_upgraded2_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_class WHERE relname = 'tab_upgraded2'");
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready or in
+# init state (tab_upgraded1 table is in ready state and tab_upgraded2 table is
+# in init state) along with retaining the replication origin remote lsn
+# and subscription running status.
+# ------------------------------------------------------
+command_ok(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode
+	],
+	'run of pg_upgrade for old instance when the subscription tables are in init/ready state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# ------------------------------------------------------
+# Check that the data inserted into the publisher while the new subscriber was
+# down is replicated once it is started. Also check that the old subscription
+# states and relation origins are all preserved.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		INSERT INTO tab_upgraded1 VALUES(51);
+		INSERT INTO tab_upgraded2 VALUES(1);
+]);
+
+$new_sub->start;
+
+# The subscription's running status should be preserved. Old subscription
+# regress_sub1 should be enabled and old subscription regress_sub2 should be
+# disabled.
+$result =
+  $new_sub->safe_psql('postgres',
+	"SELECT subname, subenabled FROM pg_subscription ORDER BY subname");
+is( $result, qq(regress_sub1|t
+regress_sub2|f),
+	"check that the subscriptions' running statuses are preserved");
+
+my $sub_oid = $new_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+
+# Subscription relations should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT srrelid, srsubstate FROM pg_subscription_rel WHERE srsubid = $sub_oid ORDER BY srrelid"
+);
+is( $result, qq($tab_upgraded1_oid|r
+$tab_upgraded2_oid|i),
+	"there should be 2 rows in pg_subscription_rel (representing tab_upgraded1 and tab_upgraded2)"
+);
+
+# The replication origin remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status WHERE external_id = 'pg_' || $sub_oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
+
+# Wait until all tables of subscription 'regress_sub2' are synchronized
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub2');
+
+# Rows on tab_upgraded1 and tab_upgraded2 should have been replicated
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(1),
+	"check the data is synced after enabling the subscription for the table that was in init state"
+);
+
+# cleanup
+$new_sub->stop;
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 4");
+$old_sub->start;
+$old_sub->safe_psql(
+	'postgres', qq[
+		ALTER SUBSCRIPTION regress_sub1 DISABLE;
+		ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none);
+		DROP SUBSCRIPTION regress_sub1;
+]);
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+# Drop the subscription
+$old_sub->start;
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init) and/or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE PUBLICATION regress_pub3 FOR TABLE tab_primary_key;
+]);
+
+# Insert a value into the subscriber's primary key column that is already
+# present on the publisher, so that the table sync will fail.
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub3;
+]);
+
+# The table will be in 'd' (data is being copied) state as table sync will
+# fail because of the primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (enabled = false)"
+);
+$sub_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
+my $reporigin = 'pg_' . qq($sub_oid);
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync (invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub1->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub3\" schema:\"public\" relation:\"tab_primary_key\"/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub4 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub4\"/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 77e8b13764..3a1ba14018 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11395,6 +11395,16 @@
   proname => 'binary_upgrade_logical_slot_has_caught_up', provolatile => 'v',
   proparallel => 'u', prorettype => 'bool', proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h
index f5324b710d..34ec3117a3 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -81,7 +81,7 @@ typedef struct SubscriptionRelState
 } SubscriptionRelState;
 
 extern void AddSubscriptionRelState(Oid subid, Oid relid, char state,
-									XLogRecPtr sublsn);
+									XLogRecPtr sublsn, bool retain_lock);
 extern void UpdateSubscriptionRelState(Oid subid, Oid relid, char state,
 									   XLogRecPtr sublsn);
 extern char GetSubscriptionRelState(Oid subid, Oid relid, XLogRecPtr *sublsn);
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index ba41149b88..db207812c5 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2669,6 +2669,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.34.1

#175Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#174)
1 attachment(s)
Re: pg_upgrade and logical replication

On Wed, Dec 13, 2023 at 12:09 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the comments, the attached v25 version patch has the
changes for the same.

I have looked at it again and made some cosmetic changes like changing
some comments and a minor change in one of the error messages. See, if
the changes look okay to you.

--
With Regards,
Amit Kapila.

Attachments:

v26-0001-Allow-upgrades-to-preserve-the-full-subscription.patch (application/octet-stream)
From d5ce43ba4c442953e31e4f87330caf52e6caaa88 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Thu, 7 Dec 2023 09:50:27 +0530
Subject: [PATCH v27] Allow upgrades to preserve the full subscription's state.

This feature will allow us to replicate the changes on subscriber nodes
after the upgrade.

Previously, only the subscription metadata information was preserved.
Without the list of relations and their state, it's not possible to
re-enable the subscriptions without missing some records as the list of
relations can only be refreshed after enabling the subscription (and
therefore starting the apply worker).  Even if we added a way to refresh
the subscription while enabling a publication, we still wouldn't know
which relations are new on the publication side, and therefore should be
fully synced, and which shouldn't.

To preserve the subscription relations, this patch teaches pg_dump to
restore the content of pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To preserve the replication origins, this patch teaches pg_dump to update
the replication origin along with creating a subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

pg_upgrade will check that all the subscription relations are in 'i'
(init) or in 'r' (ready) state and will error out if that's not the case,
logging the reason for the failure. This helps to avoid the risk of any
dangling slot or origin after the upgrade.
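
For reference, the conditions enforced by this check can be inspected
manually on the old cluster before running pg_upgrade. The queries below
are a sketch (not part of the patch) using the same catalogs the check
relies on; both should return zero rows for the upgrade to proceed:

```sql
-- Subscription relations not in 'i' (init) or 'r' (ready) state
-- would block the upgrade.
SELECT s.subname, c.relname, r.srsubstate
FROM pg_catalog.pg_subscription_rel r
JOIN pg_catalog.pg_subscription s ON r.srsubid = s.oid
JOIN pg_catalog.pg_class c ON r.srrelid = c.oid
WHERE r.srsubstate NOT IN ('i', 'r');

-- Subscriptions whose replication origin is missing would also block it.
SELECT s.subname
FROM pg_catalog.pg_subscription s
LEFT JOIN pg_catalog.pg_replication_origin o
       ON o.roname = 'pg_' || s.oid
WHERE o.roname IS NULL;
```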

Author: Vignesh C, Julien Rouhaud, Shlok Kyal
Reviewed-by: Peter Smith, Masahiko Sawada, Amit Kapila, Michael Paquier, Hayato Kuroda
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
---
 doc/src/sgml/ref/pgupgrade.sgml            |  50 ++++
 src/backend/catalog/pg_subscription.c      |  16 +-
 src/backend/commands/subscriptioncmds.c    |   4 +-
 src/backend/utils/adt/pg_upgrade_support.c | 106 +++++++
 src/bin/pg_dump/common.c                   |  22 ++
 src/bin/pg_dump/pg_dump.c                  | 229 ++++++++++++++-
 src/bin/pg_dump/pg_dump.h                  |  24 ++
 src/bin/pg_dump/pg_dump_sort.c             |  11 +-
 src/bin/pg_upgrade/check.c                 | 193 ++++++++++++-
 src/bin/pg_upgrade/info.c                  |  56 +++-
 src/bin/pg_upgrade/meson.build             |   1 +
 src/bin/pg_upgrade/pg_upgrade.h            |   2 +
 src/bin/pg_upgrade/t/004_subscription.pl   | 319 +++++++++++++++++++++
 src/include/catalog/pg_proc.dat            |  10 +
 src/include/catalog/pg_subscription_rel.h  |   2 +-
 src/tools/pgindent/typedefs.list           |   1 +
 16 files changed, 1031 insertions(+), 15 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/004_subscription.pl

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 2520f6c50d..87be1fb1c2 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -456,6 +456,56 @@ make prefix=/usr/local/pgsql.new install
 
    </step>
 
+   <step>
+    <title>Prepare for subscriber upgrades</title>
+
+    <para>
+     Set up the <link linkend="logical-replication-config-subscriber">
+     subscriber configurations</link> in the new subscriber.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies which includes the subscription's table information present in
+     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system catalog and also the subscription's replication origin. This allows
+     logical replication on the new subscriber to continue from where the
+     old subscriber was up to. Migration of subscription dependencies is only
+     supported when the old cluster is version 17.0 or later. Subscription
+     dependencies on clusters before version 17.0 will silently be ignored.
+    </para>
+
+    <para>
+     There are some prerequisites for <application>pg_upgrade</application> to
+     be able to upgrade the subscriptions. If these are not met an error
+     will be reported.
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       All the subscription tables in the old subscriber should be in state
+       <literal>i</literal> (initialize) or <literal>r</literal> (ready). This
+       can be verified by checking <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The new cluster must have
+       <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+       configured to a value greater than or equal to the number of
+       subscriptions present in the old cluster.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </step>
+
    <step>
     <title>Stop both servers</title>
 
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d6a978f136..7167377d82 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -228,10 +228,14 @@ textarray_to_stringlist(ArrayType *textarray)
 
 /*
  * Add new state record for a subscription table.
+ *
+ * If retain_lock is true, then don't release the locks taken in this function.
+ * We normally release the locks at the end of the transaction, but in
+ * binary-upgrade mode we expect to release them immediately.
  */
 void
 AddSubscriptionRelState(Oid subid, Oid relid, char state,
-						XLogRecPtr sublsn)
+						XLogRecPtr sublsn, bool retain_lock)
 {
 	Relation	rel;
 	HeapTuple	tup;
@@ -269,7 +273,15 @@ AddSubscriptionRelState(Oid subid, Oid relid, char state,
 	heap_freetuple(tup);
 
 	/* Cleanup. */
-	table_close(rel, NoLock);
+	if (retain_lock)
+	{
+		table_close(rel, NoLock);
+	}
+	else
+	{
+		table_close(rel, RowExclusiveLock);
+		UnlockSharedObject(SubscriptionRelationId, subid, 0, AccessShareLock);
+	}
 }
 
 /*
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index edc82c11be..dd067d39ad 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -773,7 +773,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
 										 rv->schemaname, rv->relname);
 
 				AddSubscriptionRelState(subid, relid, table_state,
-										InvalidXLogRecPtr);
+										InvalidXLogRecPtr, true);
 			}
 
 			/*
@@ -943,7 +943,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 			{
 				AddSubscriptionRelState(sub->oid, relid,
 										copy_data ? SUBREL_STATE_INIT : SUBREL_STATE_READY,
-										InvalidXLogRecPtr);
+										InvalidXLogRecPtr, true);
 				ereport(DEBUG1,
 						(errmsg_internal("table \"%s.%s\" added to subscription \"%s\"",
 										 rv->schemaname, rv->relname, sub->name)));
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 92921b0239..14368aafbe 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -11,15 +11,24 @@
 
 #include "postgres.h"
 
+#include "access/relation.h"
+#include "access/table.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/heap.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_subscription_rel.h"
 #include "catalog/pg_type.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "replication/logical.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
+#include "storage/lmgr.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/lsyscache.h"
+#include "utils/pg_lsn.h"
+#include "utils/syscache.h"
 
 
 #define CHECK_IS_BINARY_UPGRADE									\
@@ -305,3 +314,100 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(!found_pending_wal);
 }
+
+/*
+ * binary_upgrade_add_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * catalog.
+ */
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+	Relation	subrel;
+	Relation	rel;
+	Oid			subid;
+	char	   *subname;
+	Oid			relid;
+	char		relstate;
+	XLogRecPtr	sublsn;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/* We must check these things before dereferencing the arguments */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+		elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	relid = PG_GETARG_OID(1);
+	relstate = PG_GETARG_CHAR(2);
+	sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+	subrel = table_open(SubscriptionRelationId, RowExclusiveLock);
+	subid = get_subscription_oid(subname, false);
+	rel = relation_open(relid, AccessShareLock);
+
+	/*
+	 * Since there are no concurrent ALTER/DROP SUBSCRIPTION commands during
+	 * the upgrade process, and the apply worker (which builds cache based on
+	 * the subscription catalog) is not running, the locks can be released
+	 * immediately.
+	 */
+	AddSubscriptionRelState(subid, relid, relstate, sublsn, false);
+	relation_close(rel, AccessShareLock);
+	table_close(subrel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * binary_upgrade_replorigin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
+{
+	Relation	rel;
+	Oid			subid;
+	char	   *subname;
+	char		originname[NAMEDATALEN];
+	RepOriginId node;
+	XLogRecPtr	remote_commit;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	/*
+	 * We must ensure a non-NULL subscription name before dereferencing the
+	 * arguments.
+	 */
+	if (PG_ARGISNULL(0))
+		elog(ERROR, "null argument to binary_upgrade_replorigin_advance is not allowed");
+
+	subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	remote_commit = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);
+
+	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+	subid = get_subscription_oid(subname, false);
+
+	ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
+
+	/* Lock to prevent the replication origin from vanishing */
+	LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	node = replorigin_by_name(originname, false);
+
+	/*
+	 * The server will be stopped after setting up the objects in the new
+	 * cluster and the origins will be flushed during the shutdown checkpoint.
+	 * This will ensure that the latest LSN values for origin will be
+	 * available after the upgrade.
+	 */
+	replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
+					   false /* backward */ ,
+					   false /* WAL log */ );
+
+	UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+	table_close(rel, RowExclusiveLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 8b0c1e7b53..764a39fcb9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator_d.h"
 #include "catalog/pg_proc_d.h"
 #include "catalog/pg_publication_d.h"
+#include "catalog/pg_subscription_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fe_utils/string_utils.h"
@@ -265,6 +266,9 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	pg_log_info("reading subscriptions");
 	getSubscriptions(fout);
 
+	pg_log_info("reading subscription membership of tables");
+	getSubscriptionTables(fout);
+
 	free(inhinfo);				/* not needed any longer */
 
 	*numTablesPtr = numTables;
@@ -978,6 +982,24 @@ findPublicationByOid(Oid oid)
 	return (PublicationInfo *) dobj;
 }
 
+/*
+ * findSubscriptionByOid
+ *	  finds the DumpableObject for the subscription with the given oid
+ *	  returns NULL if not found
+ */
+SubscriptionInfo *
+findSubscriptionByOid(Oid oid)
+{
+	CatalogId	catId;
+	DumpableObject *dobj;
+
+	catId.tableoid = SubscriptionRelationId;
+	catId.oid = oid;
+	dobj = findObjectByCatalogId(catId);
+	Assert(dobj == NULL || dobj->objType == DO_SUBSCRIPTION);
+	return (SubscriptionInfo *) dobj;
+}
+
 
 /*
  * recordExtensionMembership
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 8c0b5486b9..452bd1545e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -297,6 +297,7 @@ static void dumpPolicy(Archive *fout, const PolicyInfo *polinfo);
 static void dumpPublication(Archive *fout, const PublicationInfo *pubinfo);
 static void dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo);
 static void dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo);
+static void dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo);
 static void dumpDatabase(Archive *fout);
 static void dumpDatabaseConfig(Archive *AH, PQExpBuffer outbuf,
 							   const char *dbname, Oid dboid);
@@ -4618,6 +4619,8 @@ getSubscriptions(Archive *fout)
 	int			i_subsynccommit;
 	int			i_subpublications;
 	int			i_suborigin;
+	int			i_suboriginremotelsn;
+	int			i_subenabled;
 	int			i,
 				ntups;
 
@@ -4673,16 +4676,30 @@ getSubscriptions(Archive *fout)
 		appendPQExpBufferStr(query,
 							 " s.subpasswordrequired,\n"
 							 " s.subrunasowner,\n"
-							 " s.suborigin\n");
+							 " s.suborigin,\n");
 	else
 		appendPQExpBuffer(query,
 						  " 't' AS subpasswordrequired,\n"
 						  " 't' AS subrunasowner,\n"
-						  " '%s' AS suborigin\n",
+						  " '%s' AS suborigin,\n",
 						  LOGICALREP_ORIGIN_ANY);
 
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n"
+							 " s.subenabled\n");
+	else
+		appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n"
+							 " false AS subenabled\n");
+
+	appendPQExpBufferStr(query,
+						 "FROM pg_subscription s\n");
+
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+		appendPQExpBufferStr(query,
+							 "LEFT JOIN pg_catalog.pg_replication_origin_status o\n"
+							 "    ON o.external_id = 'pg_' || s.oid::text\n");
+
 	appendPQExpBufferStr(query,
-						 "FROM pg_subscription s\n"
 						 "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
 						 "                   WHERE datname = current_database())");
 
@@ -4709,6 +4726,8 @@ getSubscriptions(Archive *fout)
 	i_subsynccommit = PQfnumber(res, "subsynccommit");
 	i_subpublications = PQfnumber(res, "subpublications");
 	i_suborigin = PQfnumber(res, "suborigin");
+	i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
+	i_subenabled = PQfnumber(res, "subenabled");
 
 	subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
 
@@ -4746,6 +4765,13 @@ getSubscriptions(Archive *fout)
 		subinfo[i].subpublications =
 			pg_strdup(PQgetvalue(res, i, i_subpublications));
 		subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+		if (PQgetisnull(res, i, i_suboriginremotelsn))
+			subinfo[i].suboriginremotelsn = NULL;
+		else
+			subinfo[i].suboriginremotelsn =
+				pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+		subinfo[i].subenabled =
+			pg_strdup(PQgetvalue(res, i, i_subenabled));
 
 		/* Decide whether we want to dump it */
 		selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4755,6 +4781,162 @@ getSubscriptions(Archive *fout)
 	destroyPQExpBuffer(query);
 }
 
+/*
+ * getSubscriptionTables
+ *	  Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = NULL;
+	SubRelInfo *subrinfo;
+	PGresult   *res;
+	int			i_srsubid;
+	int			i_srrelid;
+	int			i_srsubstate;
+	int			i_srsublsn;
+	int			ntups;
+	Oid			last_srsubid = InvalidOid;
+
+	if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+		fout->remoteVersion < 170000)
+		return;
+
+	res = ExecuteSqlQuery(fout,
+						  "SELECT srsubid, srrelid, srsubstate, srsublsn "
+						  "FROM pg_catalog.pg_subscription_rel "
+						  "ORDER BY srsubid",
+						  PGRES_TUPLES_OK);
+	ntups = PQntuples(res);
+	if (ntups == 0)
+		goto cleanup;
+
+	/* Get pg_subscription_rel attributes */
+	i_srsubid = PQfnumber(res, "srsubid");
+	i_srrelid = PQfnumber(res, "srrelid");
+	i_srsubstate = PQfnumber(res, "srsubstate");
+	i_srsublsn = PQfnumber(res, "srsublsn");
+
+	subrinfo = pg_malloc(ntups * sizeof(SubRelInfo));
+	for (int i = 0; i < ntups; i++)
+	{
+		Oid			cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+		Oid			relid = atooid(PQgetvalue(res, i, i_srrelid));
+		TableInfo  *tblinfo;
+
+		/*
+		 * If we switched to a new subscription, check if the subscription
+		 * exists.
+		 */
+		if (cur_srsubid != last_srsubid)
+		{
+			subinfo = findSubscriptionByOid(cur_srsubid);
+			if (subinfo == NULL)
+				pg_fatal("subscription with OID %u does not exist", cur_srsubid);
+
+			last_srsubid = cur_srsubid;
+		}
+
+		tblinfo = findTableByOid(relid);
+		if (tblinfo == NULL)
+			pg_fatal("failed sanity check, table with OID %u not found",
+					 relid);
+
+		/* OK, make a DumpableObject for this relationship */
+		subrinfo[i].dobj.objType = DO_SUBSCRIPTION_REL;
+		subrinfo[i].dobj.catId.tableoid = relid;
+		subrinfo[i].dobj.catId.oid = cur_srsubid;
+		AssignDumpId(&subrinfo[i].dobj);
+		subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
+		subrinfo[i].tblinfo = tblinfo;
+		subrinfo[i].srsubstate = PQgetvalue(res, i, i_srsubstate)[0];
+		if (PQgetisnull(res, i, i_srsublsn))
+			subrinfo[i].srsublsn = NULL;
+		else
+			subrinfo[i].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
+
+		subrinfo[i].subinfo = subinfo;
+
+		/* Decide whether we want to dump it */
+		selectDumpableObject(&(subrinfo[i].dobj), fout);
+	}
+
+cleanup:
+	PQclear(res);
+}
+
+/*
+ * dumpSubscriptionTable
+ *	  Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	SubscriptionInfo *subinfo = subrinfo->subinfo;
+	PQExpBuffer query;
+	char	   *tag;
+
+	/* Do nothing in data-only dump */
+	if (dopt->dataOnly)
+		return;
+
+	Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000);
+
+	tag = psprintf("%s %s", subinfo->dobj.name, subrinfo->dobj.name);
+
+	query = createPQExpBuffer();
+
+	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+	{
+		/*
+		 * binary_upgrade_add_sub_rel_state will add the subscription relation
+		 * to the pg_subscription_rel catalog. This will be used only in
+		 * binary-upgrade mode.
+		 */
+		appendPQExpBufferStr(query,
+							 "\n-- For binary upgrade, must preserve the subscriber table.\n");
+		appendPQExpBufferStr(query,
+							 "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
+		appendStringLiteralAH(query, subrinfo->dobj.name, fout);
+		appendPQExpBuffer(query,
+						  ", %u, '%c'",
+						  subrinfo->tblinfo->dobj.catId.oid,
+						  subrinfo->srsubstate);
+
+		if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
+			appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn);
+		else
+			appendPQExpBufferStr(query, ", NULL");
+
+		appendPQExpBufferStr(query, ");\n");
+	}
+
+	/*
+	 * There is no point in creating a drop query as the drop is done by table
+	 * drop.  (If you think to change this, see also _printTocEntry().)
+	 * Although this object doesn't really have ownership as such, set the
+	 * owner field anyway to ensure that the command is run by the correct
+	 * role at restore time.
+	 */
+	if (subrinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, subrinfo->dobj.catId, subrinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tag,
+								  .namespace = subrinfo->tblinfo->dobj.namespace->dobj.name,
+								  .owner = subinfo->rolname,
+								  .description = "SUBSCRIPTION TABLE",
+								  .section = SECTION_POST_DATA,
+								  .createStmt = query->data));
+
+	/* These objects can't currently have comments or seclabels */
+
+	free(tag);
+	destroyPQExpBuffer(query);
+}
+
 /*
  * dumpSubscription
  *	  dump the definition of the given subscription
@@ -4835,6 +5017,43 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 
 	appendPQExpBufferStr(query, ");\n");
 
+	/*
+	 * In binary-upgrade mode, we allow the replication to continue after the
+	 * upgrade.
+	 */
+	if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+	{
+		if (subinfo->suboriginremotelsn)
+		{
+			/*
+			 * Preserve the remote_lsn for the subscriber's replication
+			 * origin. This value is required to start the replication from
+			 * the position before the upgrade. This value will be stale if
+			 * the publisher gets upgraded before the subscriber node.
+			 * However, this shouldn't be a problem as the upgrade of the
+			 * publisher ensures that all the transactions were replicated
+			 * before upgrading it.
+			 */
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the remote_lsn for the subscriber's replication origin.\n");
+			appendPQExpBufferStr(query,
+								 "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+			appendStringLiteralAH(query, subinfo->dobj.name, fout);
+			appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+		}
+
+		if (strcmp(subinfo->subenabled, "t") == 0)
+		{
+			/*
+			 * Enable the subscription to allow the replication to continue
+			 * after the upgrade.
+			 */
+			appendPQExpBufferStr(query,
+								 "\n-- For binary upgrade, must preserve the subscriber's running state.\n");
+			appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s ENABLE;\n", qsubname);
+		}
+	}
+
 	if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 		ArchiveEntry(fout, subinfo->dobj.catId, subinfo->dobj.dumpId,
 					 ARCHIVE_OPTS(.tag = subinfo->dobj.name,
@@ -10453,6 +10672,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_SUBSCRIPTION_REL:
+			dumpSubscriptionTable(fout, (const SubRelInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -18519,6 +18741,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 			case DO_PUBLICATION_REL:
 			case DO_PUBLICATION_TABLE_IN_SCHEMA:
 			case DO_SUBSCRIPTION:
+			case DO_SUBSCRIPTION_REL:
 				/* Post-data objects: must come after the post-data boundary */
 				addObjectDependency(dobj, postDataBound->dumpId);
 				break;
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..29f39ddcda 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -83,6 +83,7 @@ typedef enum
 	DO_PUBLICATION_REL,
 	DO_PUBLICATION_TABLE_IN_SCHEMA,
 	DO_SUBSCRIPTION,
+	DO_SUBSCRIPTION_REL,		/* see note for SubRelInfo */
 } DumpableObjectType;
 
 /*
@@ -660,6 +661,7 @@ typedef struct _SubscriptionInfo
 {
 	DumpableObject dobj;
 	const char *rolname;
+	char	   *subenabled;
 	char	   *subbinary;
 	char	   *substream;
 	char	   *subtwophasestate;
@@ -671,8 +673,28 @@ typedef struct _SubscriptionInfo
 	char	   *subsynccommit;
 	char	   *subpublications;
 	char	   *suborigin;
+	char	   *suboriginremotelsn;
 } SubscriptionInfo;
 
+/*
+ * The SubRelInfo struct is used to represent a subscription relation.
+ *
+ * XXX Currently, in binary-upgrade mode, the subscription tables are added to
+ * the subscription after the subscription has been enabled. Since apply
+ * workers are not started in binary-upgrade mode, this ordering does not
+ * matter. The ordering of adding the subscription tables and enabling the
+ * subscription will need to be reconsidered if this feature is ever supported
+ * outside binary-upgrade mode.
+ */
+typedef struct _SubRelInfo
+{
+	DumpableObject dobj;
+	SubscriptionInfo *subinfo;
+	TableInfo  *tblinfo;
+	char		srsubstate;
+	char	   *srsublsn;
+} SubRelInfo;
+
 /*
  *	common utility functions
  */
@@ -697,6 +719,7 @@ extern CollInfo *findCollationByOid(Oid oid);
 extern NamespaceInfo *findNamespaceByOid(Oid oid);
 extern ExtensionInfo *findExtensionByOid(Oid oid);
 extern PublicationInfo *findPublicationByOid(Oid oid);
+extern SubscriptionInfo *findSubscriptionByOid(Oid oid);
 
 extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
 extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
@@ -756,5 +779,6 @@ extern void getPublicationNamespaces(Archive *fout);
 extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
 								 int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);
 
 #endif							/* PG_DUMP_H */
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index abfea15c09..e8d9c8ac86 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -94,6 +94,7 @@ enum dbObjectTypePriorities
 	PRIO_PUBLICATION_REL,
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,
 	PRIO_SUBSCRIPTION,
+	PRIO_SUBSCRIPTION_REL,
 	PRIO_DEFAULT_ACL,			/* done in ACL pass */
 	PRIO_EVENT_TRIGGER,			/* must be next to last! */
 	PRIO_REFRESH_MATVIEW		/* must be last! */
@@ -147,10 +148,11 @@ static const int dbObjectTypePriority[] =
 	PRIO_PUBLICATION,			/* DO_PUBLICATION */
 	PRIO_PUBLICATION_REL,		/* DO_PUBLICATION_REL */
 	PRIO_PUBLICATION_TABLE_IN_SCHEMA,	/* DO_PUBLICATION_TABLE_IN_SCHEMA */
-	PRIO_SUBSCRIPTION			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION,			/* DO_SUBSCRIPTION */
+	PRIO_SUBSCRIPTION_REL		/* DO_SUBSCRIPTION_REL */
 };
 
-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
+StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
 				 "array length mismatch");
 
 static DumpId preDataBoundId;
@@ -1472,6 +1474,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "SUBSCRIPTION (ID %d OID %u)",
 					 obj->dumpId, obj->catId.oid);
 			return;
+		case DO_SUBSCRIPTION_REL:
+			snprintf(buf, bufsize,
+					 "SUBSCRIPTION TABLE (ID %d OID %u)",
+					 obj->dumpId, obj->catId.oid);
+			return;
 		case DO_PRE_DATA_BOUNDARY:
 			snprintf(buf, bufsize,
 					 "PRE-DATA BOUNDARY  (ID %d)",
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fa52aa2c22..87c06628c6 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -34,7 +34,9 @@ static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 static void check_new_cluster_logical_replication_slots(void);
+static void check_new_cluster_subscription_configuration(void);
 static void check_old_cluster_for_valid_slots(bool live_check);
+static void check_old_cluster_subscription_state(void);
 
 
 /*
@@ -112,13 +114,21 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/*
-	 * Logical replication slots can be migrated since PG17. See comments atop
-	 * get_old_cluster_logical_slot_infos().
-	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+	{
+		/*
+		 * Logical replication slots can be migrated since PG17. See comments
+		 * atop get_old_cluster_logical_slot_infos().
+		 */
 		check_old_cluster_for_valid_slots(live_check);
 
+		/*
+		 * Subscriptions and their dependencies can be migrated since PG17.
+		 * See comments atop get_db_subscription_count().
+		 */
+		check_old_cluster_subscription_state();
+	}
+
 	/*
 	 * PG 16 increased the size of the 'aclitem' type, which breaks the
 	 * on-disk format for existing data.
@@ -237,6 +247,8 @@ check_new_cluster(void)
 	check_for_new_tablespace_dir();
 
 	check_new_cluster_logical_replication_slots();
+
+	check_new_cluster_subscription_configuration();
 }
 
 
@@ -1538,6 +1550,53 @@ check_new_cluster_logical_replication_slots(void)
 	check_ok();
 }
 
+/*
+ * check_new_cluster_subscription_configuration()
+ *
+ * Verify that the max_replication_slots setting in the new cluster is high
+ * enough to create the migrated subscriptions, as a replication origin is
+ * created for each subscription.
+ */
+static void
+check_new_cluster_subscription_configuration(void)
+{
+	PGresult   *res;
+	PGconn	   *conn;
+	int			nsubs_on_old;
+	int			max_replication_slots;
+
+	/* Subscriptions and their dependencies can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	nsubs_on_old = count_old_cluster_subscriptions();
+
+	/* Quick return if there are no subscriptions to be migrated. */
+	if (nsubs_on_old == 0)
+		return;
+
+	prep_status("Checking new cluster configuration for subscriptions");
+
+	conn = connectToServer(&new_cluster, "template1");
+
+	res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+							"WHERE name = 'max_replication_slots';");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine parameter settings on new cluster");
+
+	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
+	if (nsubs_on_old > max_replication_slots)
+		pg_fatal("max_replication_slots (%d) must be greater than or equal to the number of "
+				 "subscriptions (%d) on the old cluster",
+				 max_replication_slots, nsubs_on_old);
+
+	PQclear(res);
+	PQfinish(conn);
+
+	check_ok();
+}
+
 /*
  * check_old_cluster_for_valid_slots()
  *
@@ -1613,3 +1672,129 @@ check_old_cluster_for_valid_slots(bool live_check)
 
 	check_ok();
 }
+
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that the replication origin corresponding to each of the
+ * subscriptions is present and that each of the subscribed tables is in
+ * 'i' (initialize) or 'r' (ready) state.
+ */
+static void
+check_old_cluster_subscription_state(void)
+{
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	int			ntup;
+
+	prep_status("Checking for subscription state");
+
+	snprintf(output_path, sizeof(output_path), "%s/%s",
+			 log_opts.basedir,
+			 "subs_invalid.txt");
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		DbInfo	   *active_db = &old_cluster.dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(&old_cluster, active_db->db_name);
+
+		/* We need to check for pg_replication_origin only once. */
+		if (dbnum == 0)
+		{
+			/*
+			 * Check that all the subscriptions have their respective
+			 * replication origin.
+			 */
+			res = executeQueryOrDie(conn,
+									"SELECT d.datname, s.subname "
+									"FROM pg_catalog.pg_subscription s "
+									"LEFT OUTER JOIN pg_catalog.pg_replication_origin o "
+									"	ON o.roname = 'pg_' || s.oid "
+									"INNER JOIN pg_catalog.pg_database d "
+									"	ON d.oid = s.subdbid "
+									"WHERE o.roname IS NULL;");
+
+			ntup = PQntuples(res);
+			for (int i = 0; i < ntup; i++)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s",
+							 output_path, strerror(errno));
+				fprintf(script, "The replication origin is missing for database:\"%s\" subscription:\"%s\"\n",
+						PQgetvalue(res, i, 0),
+						PQgetvalue(res, i, 1));
+			}
+			PQclear(res);
+		}
+
+		/*
+		 * We don't allow the upgrade if there is a risk of a dangling slot or
+		 * origin corresponding to the initial sync after the upgrade.
+		 *
+		 * A slot/origin not created yet refers to the 'i' (initialize) state,
+		 * while 'r' (ready) state refers to a slot/origin created previously
+		 * but already dropped. These states are supported for pg_upgrade. The
+		 * other states listed below are not supported:
+		 *
+		 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+		 * would retain a replication slot, which could not be dropped by the
+		 * sync worker spawned after the upgrade because the subscription ID
+		 * used for the slot name won't match anymore.
+		 *
+		 * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+		 * would retain the replication origin when there is a failure in
+		 * tablesync worker immediately after dropping the replication slot in
+		 * the publisher.
+		 *
+		 * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+		 * a relation upgraded while in this state would expect an origin ID
+		 * with the OID of the subscription used before the upgrade, causing
+		 * it to fail.
+		 *
+		 * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+		 * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+		 * so we need not allow these states.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT r.srsubstate, s.subname, n.nspname, c.relname "
+								"FROM pg_catalog.pg_subscription_rel r "
+								"LEFT JOIN pg_catalog.pg_subscription s"
+								"	ON r.srsubid = s.oid "
+								"LEFT JOIN pg_catalog.pg_class c"
+								"	ON r.srrelid = c.oid "
+								"LEFT JOIN pg_catalog.pg_namespace n"
+								"	ON c.relnamespace = n.oid "
+								"WHERE r.srsubstate NOT IN ('i', 'r') "
+								"ORDER BY s.subname");
+
+		ntup = PQntuples(res);
+		for (int i = 0; i < ntup; i++)
+		{
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s",
+						 output_path, strerror(errno));
+
+			fprintf(script, "The table sync state \"%s\" is not allowed for database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\"\n",
+					PQgetvalue(res, i, 0),
+					active_db->db_name,
+					PQgetvalue(res, i, 1),
+					PQgetvalue(res, i, 2),
+					PQgetvalue(res, i, 3));
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains subscriptions without a replication origin or with relations not in 'i' (initialize) or 'r' (ready) state.\n"
+				 "You can allow the initial sync to finish for all relations and then restart the upgrade.\n"
+				 "A list of the problematic subscriptions is in the file:\n"
+				 "    %s", output_path);
+	}
+	else
+		check_ok();
+}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..f70742851c 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,6 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
+static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,10 +294,14 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		get_rel_infos(cluster, pDbInfo);
 
 		/*
-		 * Retrieve the logical replication slots infos for the old cluster.
+		 * Retrieve the logical replication slots infos and the subscriptions
+		 * count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
+		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
+			get_db_subscription_count(pDbInfo);
+		}
 	}
 
 	if (cluster == &old_cluster)
@@ -730,6 +735,55 @@ count_old_cluster_logical_slots(void)
 	return slot_count;
 }
 
+/*
+ * get_db_subscription_count()
+ *
+ * Gets the number of subscriptions in the database referred to by "dbinfo".
+ *
+ * Note: This function will not do anything if the old cluster is pre-PG17.
+ * Before that, logical slots are not upgraded, so we would not be able to
+ * upgrade logical replication clusters completely.
+ */
+static void
+get_db_subscription_count(DbInfo *dbinfo)
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	/* Subscriptions can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+		return;
+
+	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	res = executeQueryOrDie(conn, "SELECT count(*) "
+							"FROM pg_catalog.pg_subscription WHERE subdbid = %d",
+							dbinfo->db_oid);
+	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscriptions for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 and prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_db_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+	int			nsubs = 0;
+
+	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+	return nsubs;
+}
+
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index 3e8a08e062..32f12f9e27 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -43,6 +43,7 @@ tests += {
       't/001_basic.pl',
       't/002_pg_upgrade.pl',
       't/003_logical_slots.pl',
+      't/004_subscription.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index a710f325de..d63f13fffc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -195,6 +195,7 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
+	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -421,6 +422,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
+int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
new file mode 100644
index 0000000000..d08ffffe10
--- /dev/null
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -0,0 +1,319 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test for pg_upgrade of logical subscription
+use strict;
+use warnings;
+
+use File::Find qw(find);
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
+
+# Initialize publisher node
+my $publisher = PostgreSQL::Test::Cluster->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+# Initialize the old subscriber node
+my $old_sub = PostgreSQL::Test::Cluster->new('old_sub');
+$old_sub->init;
+$old_sub->start;
+my $oldbindir = $old_sub->config_data('--bindir');
+
+# Initialize the new subscriber
+my $new_sub = PostgreSQL::Test::Cluster->new('new_sub');
+$new_sub->init;
+my $newbindir = $new_sub->config_data('--bindir');
+
+# In a VPATH build, we'll be started in the source directory, but we want
+# to run pg_upgrade in the build directory so that any files generated finish
+# in it, like delete_old_cluster.{sh,bat}.
+chdir ${PostgreSQL::Test::Utils::tmp_check};
+
+# Initial setup
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE TABLE tab_upgraded2(id int);
+]);
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE TABLE tab_upgraded2(id int);
+]);
+
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+# Setup an enabled subscription to verify that the running status is retained
+# after upgrade.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+# Verify that the upgrade should be successful with tables in 'ready'/'init'
+# state along with retaining the replication origin's remote lsn, and
+# subscription's running status.
+$publisher->safe_psql('postgres',
+	"CREATE PUBLICATION regress_pub2 FOR TABLE tab_upgraded1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2"
+);
+# Wait till the table tab_upgraded1 reaches 'ready' state
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";
+
+$publisher->safe_psql('postgres',
+	"INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))");
+$publisher->wait_for_catchup('regress_sub2');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+$old_sub->restart;
+
+$publisher->safe_psql('postgres',
+	"ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+$old_sub->safe_psql('postgres',
+	"ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+
+# The table tab_upgraded2 will be in init state as the subscriber
+# configuration for max_logical_replication_workers is set to 0.
+my $result = $old_sub->safe_psql('postgres',
+	"SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
+is($result, qq(t), "Check that the table is in init state");
+
+# Get the replication origin's remote_lsn of the old subscriber
+my $remote_lsn = $old_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub2'"
+);
+# Have the subscription in disabled state before upgrade
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+
+my $tab_upgraded1_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_class WHERE relname = 'tab_upgraded1'");
+my $tab_upgraded2_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_class WHERE relname = 'tab_upgraded2'");
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready or in
+# init state (tab_upgraded1 table is in ready state and tab_upgraded2 table is
+# in init state) along with retaining the replication origin's remote lsn
+# and subscription's running status.
+# ------------------------------------------------------
+command_ok(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode
+	],
+	'run of pg_upgrade for old instance when the subscription tables are in init/ready state'
+);
+ok( !-d $new_sub->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after successful pg_upgrade");
+
+# ------------------------------------------------------
+# Check that the data inserted to the publisher when the new subscriber is down
+# will be replicated once it is started. Also check that the old subscription
+# states and relations origins are all preserved.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		INSERT INTO tab_upgraded1 VALUES(51);
+		INSERT INTO tab_upgraded2 VALUES(1);
+]);
+
+$new_sub->start;
+
+# The subscription's running status should be preserved. Old subscription
+# regress_sub1 should be enabled and old subscription regress_sub2 should be
+# disabled.
+$result =
+  $new_sub->safe_psql('postgres',
+	"SELECT subname, subenabled FROM pg_subscription ORDER BY subname");
+is( $result, qq(regress_sub1|t
+regress_sub2|f),
+	"check that the subscription's running status are preserved");
+
+my $sub_oid = $new_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+
+# Subscription relations should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT srrelid, srsubstate FROM pg_subscription_rel WHERE srsubid = $sub_oid ORDER BY srrelid"
+);
+is( $result, qq($tab_upgraded1_oid|r
+$tab_upgraded2_oid|i),
+	"there should be 2 rows in pg_subscription_rel(representing tab_upgraded1 and tab_upgraded2)"
+);
+
+# The replication origin's remote_lsn should be preserved
+$result = $new_sub->safe_psql('postgres',
+	"SELECT remote_lsn FROM pg_replication_origin_status WHERE external_id = 'pg_' || $sub_oid"
+);
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
+
+# Wait until all tables of subscription 'regress_sub2' are synchronized
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub2');
+
+# Rows on tab_upgraded1 and tab_upgraded2 should have been replicated
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded1");
+is($result, qq(51), "check replicated inserts on new subscriber");
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded2");
+is($result, qq(1),
+	"check the data is synced after enabling the subscription for the table that was in init state"
+);
+
+# cleanup
+$new_sub->stop;
+$old_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 4");
+$old_sub->start;
+$old_sub->safe_psql(
+	'postgres', qq[
+		ALTER SUBSCRIPTION regress_sub1 DISABLE;
+		ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none);
+		DROP SUBSCRIPTION regress_sub1;
+]);
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+# Drop the subscription
+$old_sub->start;
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init) and/or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE PUBLICATION regress_pub3 FOR TABLE tab_primary_key;
+]);
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub3;
+]);
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (enabled = false)"
+);
+$sub_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
+my $reporigin = 'pg_' . qq($sub_oid);
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync',
+		'-d', $old_sub->data_dir,
+		'-D', $new_sub1->data_dir,
+		'-b', $oldbindir,
+		'-B', $newbindir,
+		'-s', $new_sub1->host,
+		'-p', $old_sub->port,
+		'-P', $new_sub1->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub1->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub3\" schema:\"public\" relation:\"tab_primary_key\"/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub4 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub4\"/m,
+	'the previous test failed due to missing replication origin');
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9052f5262a..5b67784731 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11410,6 +11410,16 @@
   proname => 'binary_upgrade_logical_slot_has_caught_up', provolatile => 'v',
   proparallel => 'u', prorettype => 'bool', proargtypes => 'name',
   prosrc => 'binary_upgrade_logical_slot_has_caught_up' },
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
 
 # conversion functions
 { oid => '4302',
diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h
index f5324b710d..34ec3117a3 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -81,7 +81,7 @@ typedef struct SubscriptionRelState
 } SubscriptionRelState;
 
 extern void AddSubscriptionRelState(Oid subid, Oid relid, char state,
-									XLogRecPtr sublsn);
+									XLogRecPtr sublsn, bool retain_lock);
 extern void UpdateSubscriptionRelState(Oid subid, Oid relid, char state,
 									   XLogRecPtr sublsn);
 extern char GetSubscriptionRelState(Oid subid, Oid relid, XLogRecPtr *sublsn);
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e37ef9aa76..0a257aa5ce 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2669,6 +2669,7 @@ SubLinkType
 SubOpts
 SubPlan
 SubPlanState
+SubRelInfo
 SubRemoveRels
 SubTransactionId
 SubXactCallback
-- 
2.39.1

#176vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#175)
Re: pg_upgrade and logical replication

On Thu, 28 Dec 2023 at 15:59, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 13, 2023 at 12:09 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the comments, the attached v25 version patch has the
changes for the same.

I have looked at it again and made some cosmetic changes like changing
some comments and a minor change in one of the error messages. See, if
the changes look okay to you.

Thanks, the changes look good.

Regards,
Vignesh

#177Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#176)
Re: pg_upgrade and logical replication

On Fri, Dec 29, 2023 at 2:26 PM vignesh C <vignesh21@gmail.com> wrote:

On Thu, 28 Dec 2023 at 15:59, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 13, 2023 at 12:09 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the comments, the attached v25 version patch has the
changes for the same.

I have looked at it again and made some cosmetic changes like changing
some comments and a minor change in one of the error messages. See, if
the changes look okay to you.

Thanks, the changes look good.

Pushed.

--
With Regards,
Amit Kapila.

#178Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#177)
Re: pg_upgrade and logical replication

On Tue, Jan 02, 2024 at 03:58:25PM +0530, Amit Kapila wrote:

On Fri, Dec 29, 2023 at 2:26 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks, the changes look good.

Pushed.

Yeah! Thanks Amit and everybody involved here! Thanks also to Julien
for raising the thread and the problem, to start with.
--
Michael

#179Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#178)
Re: pg_upgrade and logical replication

On Wed, Jan 3, 2024 at 6:21 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Jan 02, 2024 at 03:58:25PM +0530, Amit Kapila wrote:

On Fri, Dec 29, 2023 at 2:26 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks, the changes look good.

Pushed.

Yeah! Thanks Amit and everybody involved here! Thanks also to Julien
for raising the thread and the problem, to start with.

I think the next possible step here is to document how to upgrade the
logical replication nodes as previously discussed in this thread [1].
IIRC, there were a few issues with the steps mentioned but if we want
to document those we can start a separate thread for it as that
involves both publishers and subscribers.
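
For reference, the subscriber-side flow that the committed TAP test exercises looks roughly like this (a sketch only, with an illustrative subscription name, not the final documented procedure):

```sql
-- On the old subscriber, before running pg_upgrade: all subscribed tables
-- must be in 'r' (ready) or 'i' (init) state, and the subscription is
-- typically disabled so no changes are applied during the switchover.
ALTER SUBSCRIPTION regress_sub DISABLE;

-- ... stop the old server, run pg_upgrade, start the new server ...

-- On the upgraded subscriber, the subscription, its pg_subscription_rel
-- entries, and the replication origin's remote_lsn are preserved, so
-- re-enabling resumes apply from where the old node left off.
ALTER SUBSCRIPTION regress_sub ENABLE;
```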

[1]: /messages/by-id/CALDaNm2pe7SoOGtRkrTNsnZPnaaY+2iHC40HBYCSLYmyRg0wSw@mail.gmail.com

--
With Regards,
Amit Kapila.

#180Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#179)
Re: pg_upgrade and logical replication

On Wed, Jan 03, 2024 at 11:24:50AM +0530, Amit Kapila wrote:

I think the next possible step here is to document how to upgrade the
logical replication nodes as previously discussed in this thread [1].
IIRC, there were a few issues with the steps mentioned but if we want
to document those we can start a separate thread for it as that
involves both publishers and subscribers.

[1] - /messages/by-id/CALDaNm2pe7SoOGtRkrTNsnZPnaaY+2iHC40HBYCSLYmyRg0wSw@mail.gmail.com

Yep. A second thing is whether it makes sense to have more automated
test coverage when it comes to the interferences between subscribers
and publishers with more complex node structures.
--
Michael

#181Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#180)
Re: pg_upgrade and logical replication

On Wed, Jan 3, 2024 at 11:33 AM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Jan 03, 2024 at 11:24:50AM +0530, Amit Kapila wrote:

I think the next possible step here is to document how to upgrade the
logical replication nodes as previously discussed in this thread [1].
IIRC, there were a few issues with the steps mentioned but if we want
to document those we can start a separate thread for it as that
involves both publishers and subscribers.

[1] - /messages/by-id/CALDaNm2pe7SoOGtRkrTNsnZPnaaY+2iHC40HBYCSLYmyRg0wSw@mail.gmail.com

Yep. A second thing is whether it makes sense to have more automated
test coverage when it comes to the interferences between subscribers
and publishers with more complex node structures.

I think it would be good to finish the pending patch to improve the
IsBinaryUpgrade check [1] which we decided to do once this patch is
ready. Would you like to take that up or do you want me to finish it?

[1]: /messages/by-id/ZU2TeVkUg5qEi7Oy@paquier.xyz
[2]: /messages/by-id/ZVQtUTdJACnsbbpd@paquier.xyz

--
With Regards,
Amit Kapila.

#182Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#181)
Re: pg_upgrade and logical replication

On Wed, Jan 03, 2024 at 03:18:50PM +0530, Amit Kapila wrote:

I think it would be good to finish the pending patch to improve the
IsBinaryUpgrade check [1] which we decided to do once this patch is
ready. Would you like to take that up or do you want me to finish it?

[1] - /messages/by-id/ZU2TeVkUg5qEi7Oy@paquier.xyz
[2] - /messages/by-id/ZVQtUTdJACnsbbpd@paquier.xyz

Yep, that's on my TODO. I can send a new version at the beginning of
next week. No problem.
--
Michael

#183vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#179)
Re: pg_upgrade and logical replication

On Wed, 3 Jan 2024 at 11:25, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jan 3, 2024 at 6:21 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Jan 02, 2024 at 03:58:25PM +0530, Amit Kapila wrote:

On Fri, Dec 29, 2023 at 2:26 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks, the changes look good.

Pushed.

Yeah! Thanks Amit and everybody involved here! Thanks also to Julien
for raising the thread and the problem, to start with.

I think the next possible step here is to document how to upgrade the
logical replication nodes as previously discussed in this thread [1].
IIRC, there were a few issues with the steps mentioned but if we want
to document those we can start a separate thread for it as that
involves both publishers and subscribers.

I have posted a patch for this at:
/messages/by-id/CALDaNm1_iDO6srWzntqTr0ZDVkk2whVhNKEWAvtgZBfSmuBeZQ@mail.gmail.com

Regards,
Vignesh

#184vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#177)
Re: pg_upgrade and logical replication

On Tue, 2 Jan 2024 at 15:58, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Dec 29, 2023 at 2:26 PM vignesh C <vignesh21@gmail.com> wrote:

On Thu, 28 Dec 2023 at 15:59, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 13, 2023 at 12:09 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the comments, the attached v25 version patch has the
changes for the same.

I have looked at it again and made some cosmetic changes like changing
some comments and a minor change in one of the error messages. See, if
the changes look okay to you.

Thanks, the changes look good.

Pushed.

Thanks for pushing this patch, I have updated the commitfest entry to
Committed for the same.

Regards,
Vignesh

#185Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#181)
Re: pg_upgrade and logical replication

On Wed, Jan 03, 2024 at 03:18:50PM +0530, Amit Kapila wrote:

I think it would be good to finish the pending patch to improve the
IsBinaryUpgrade check [1] which we decided to do once this patch is
ready. Would you like to take that up or do you want me to finish it?

[1] - /messages/by-id/ZU2TeVkUg5qEi7Oy@paquier.xyz
[2] - /messages/by-id/ZVQtUTdJACnsbbpd@paquier.xyz

My apologies for the delay, again. I have sent an update here:
/messages/by-id/ZZ4f3zKu0YyFndHi@paquier.xyz
--
Michael

#186vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#161)
Re: pg_upgrade and logical replication

On Wed, 14 Feb 2024 at 09:07, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Dear Justin,

pg_upgrade/t/004_subscription.pl says

|my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';

..but I think maybe it should not.

When you try to use --link, it fails:
https://cirrus-ci.com/task/4669494061170688

|Adding ".old" suffix to old global/pg_control ok
|
|If you want to start the old cluster, you will need to remove
|the ".old" suffix from /tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata/global/pg_control.old.
|Because "link" mode was used, the old cluster cannot be safely
|started once the new cluster has been started.
|...
|
|postgres: could not find the database system
|Expected to find it in the directory "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata",
|but could not open file "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata/global/pg_control": No such file or directory
|# No postmaster PID for node "old_sub"
|[19:36:01.396](0.250s) Bail out! pg_ctl start failed

Good catch! The primary reason for the failure is that the old cluster is reused, even
after the successful upgrade. The documentation says [1]:

If you use link mode, the upgrade will be much faster (no file copying) and use less
disk space, but you will not be able to access your old cluster once you start the new
cluster after the upgrade.

You could rename pg_control.old to avoid that immediate error, but that doesn't
address the essential issue that "the old cluster cannot be safely started once
the new cluster has been started."

Yeah, I agree that accessing the old cluster after the upgrade should be avoided.
IIUC, pg_upgrade is run three times in 004_subscription:

1. successful upgrade
2. failure due to the insufficient max_replication_slot
3. failure because the pg_subscription_rel has 'd' state

And the old instance is reused in all of these runs. Therefore, the most reasonable fix is to
change the ordering of the tests, i.e., the "successful upgrade" case should run last.

Attached patch modified the test accordingly. Also, it contains some optimizations.

Your proposal to run the tests in the following order looks good to me:
a) failure due to insufficient max_replication_slots, b) failure because
pg_subscription_rel has a relation in 'd' state, c) successful upgrade.
I have also verified that your changes fix the issue, as the successful
upgrade is moved to the end and the old cluster is no longer used after
the upgrade.

One minor suggestion:
There is an extra line break here; it can be removed:
@@ -181,139 +310,5 @@ is($result, qq(1),
"check the data is synced after enabling the subscription for
the table that was in init state"
);

-# cleanup

Regards,
Vignesh

#187Justin Pryzby
pryzby@telsasoft.com
In reply to: Amit Kapila (#177)
Re: pg_upgrade and logical replication

On Tue, Jan 02, 2024 at 03:58:25PM +0530, Amit Kapila wrote:

Pushed.

pg_upgrade/t/004_subscription.pl says

|my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';

..but I think maybe it should not.

When you try to use --link, it fails:
https://cirrus-ci.com/task/4669494061170688

|Adding ".old" suffix to old global/pg_control ok
|
|If you want to start the old cluster, you will need to remove
|the ".old" suffix from /tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata/global/pg_control.old.
|Because "link" mode was used, the old cluster cannot be safely
|started once the new cluster has been started.
|...
|
|postgres: could not find the database system
|Expected to find it in the directory "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata",
|but could not open file "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata/global/pg_control": No such file or directory
|# No postmaster PID for node "old_sub"
|[19:36:01.396](0.250s) Bail out! pg_ctl start failed

You could rename pg_control.old to avoid that immediate error, but that doesn't
address the essential issue that "the old cluster cannot be safely started once
the new cluster has been started."

--
Justin

#188Michael Paquier
michael@paquier.xyz
In reply to: Justin Pryzby (#187)
Re: pg_upgrade and logical replication

On Tue, Feb 13, 2024 at 03:05:14PM -0600, Justin Pryzby wrote:

On Tue, Jan 02, 2024 at 03:58:25PM +0530, Amit Kapila wrote:

Pushed.

pg_upgrade/t/004_subscription.pl says

|my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';

..but I think maybe it should not.

When you try to use --link, it fails:
https://cirrus-ci.com/task/4669494061170688

Thanks. It is the kind of things we don't want to lose sight on, so I
have taken this occasion to create a wiki page for the open items of
17, and added this one to it:
https://wiki.postgresql.org/wiki/PostgreSQL_17_Open_Items
--
Michael

#189Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Justin Pryzby (#187)
1 attachment(s)
RE: pg_upgrade and logical replication

Dear Justin,

pg_upgrade/t/004_subscription.pl says

|my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';

..but I think maybe it should not.

When you try to use --link, it fails:
https://cirrus-ci.com/task/4669494061170688

|Adding ".old" suffix to old global/pg_control ok
|
|If you want to start the old cluster, you will need to remove
|the ".old" suffix from /tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata/global/pg_control.old.
|Because "link" mode was used, the old cluster cannot be safely
|started once the new cluster has been started.
|...
|
|postgres: could not find the database system
|Expected to find it in the directory "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata",
|but could not open file "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata/global/pg_control": No such file or directory
|# No postmaster PID for node "old_sub"
|[19:36:01.396](0.250s) Bail out! pg_ctl start failed

Good catch! The primary reason for the failure is that the old cluster is reused, even
after the successful upgrade. The documentation says [1]:

If you use link mode, the upgrade will be much faster (no file copying) and use less
disk space, but you will not be able to access your old cluster once you start the new
cluster after the upgrade.

You could rename pg_control.old to avoid that immediate error, but that doesn't
address the essential issue that "the old cluster cannot be safely started once
the new cluster has been started."

Yeah, I agree that accessing the old cluster after the upgrade should be avoided.
IIUC, pg_upgrade is run three times in 004_subscription:

1. successful upgrade
2. failure due to the insufficient max_replication_slot
3. failure because the pg_subscription_rel has 'd' state

And the old instance is reused in all of these runs. Therefore, the most reasonable fix is to
change the ordering of the tests, i.e., the "successful upgrade" case should run last.
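
Incidentally, the conditions that make pg_upgrade refuse the run can be spotted up front on the old subscriber with a query like this (a sketch; any relation in a state other than 'r' or 'i' blocks the upgrade):

```sql
-- List subscription relations whose sync state would make
-- pg_upgrade --check fail, e.g. 'd' (data is being copied).
SELECT s.subname, sr.srrelid::regclass AS relation, sr.srsubstate
FROM pg_subscription_rel sr
JOIN pg_subscription s ON s.oid = sr.srsubid
WHERE sr.srsubstate NOT IN ('r', 'i');
```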

Attached patch modified the test accordingly. Also, it contains some optimizations.
This can pass the test on my env:

```
pg_upgrade]$ PG_TEST_PG_UPGRADE_MODE='--link' PG_TEST_TIMEOUT_DEFAULT=10 make check PROVE_TESTS='t/004_subscription.pl'
...
# +++ tap check in src/bin/pg_upgrade +++
t/004_subscription.pl .. ok
All tests successful.
Files=1, Tests=14, 9 wallclock secs ( 0.03 usr 0.00 sys + 0.55 cusr 1.08 csys = 1.66 CPU)
Result: PASS
```

What do you think?

[1]: https://www.postgresql.org/docs/devel/pgupgrade.html

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

Attachments:

0001-Fix-testcase.patch (application/octet-stream)
From e8681d84945903bf7cd3d0a03a09f7d5c9da0b84 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Wed, 14 Feb 2024 03:01:08 +0000
Subject: [PATCH] Fix testcase

---
 src/bin/pg_upgrade/t/004_subscription.pl | 303 +++++++++++------------
 1 file changed, 149 insertions(+), 154 deletions(-)

diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
index 63c0a98376..fb8c7bdcde 100644
--- a/src/bin/pg_upgrade/t/004_subscription.pl
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -5,6 +5,7 @@ use strict;
 use warnings FATAL => 'all';
 
 use File::Find qw(find);
+use File::Path qw(rmtree);
 
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
@@ -49,13 +50,140 @@ $old_sub->safe_psql(
 # Setup logical replication
 my $connstr = $publisher->connstr . ' dbname=postgres';
 
-# Setup an enabled subscription to verify that the running status and failover
-# option are retained after the upgrade.
+# Setup a subscription to verify that the failover option are retained after
+# the upgrade.
 $publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
 $old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1 WITH (failover = true)"
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1 WITH (failover = true, enabled = false)"
 );
-$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+$old_sub->start;
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init) and/or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		ALTER PUBLICATION regress_pub1 ADD TABLE tab_primary_key;
+]);
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		ALTER SUBSCRIPTION regress_sub1 ENABLE;
+		ALTER SUBSCRIPTION regress_sub1 REFRESH PUBLICATION;
+]);
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub1 WITH (enabled = false)"
+);
+my $sub_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+my $reporigin = 'pg_' . qq($sub_oid);
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub1\" schema:\"public\" relation:\"tab_primary_key\"/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub2 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub2\"/m,
+	'the previous test failed due to missing replication origin');
+
+# cleanup
+$old_sub->start;
+$publisher->safe_psql(
+	'postgres', qq[
+		ALTER PUBLICATION regress_pub1 DROP TABLE tab_primary_key;
+		DROP TABLE tab_primary_key;
+]);
+$old_sub->safe_psql(
+	'postgres', qq[
+		DROP SUBSCRIPTION regress_sub2;
+		ALTER SUBSCRIPTION regress_sub1 REFRESH PUBLICATION;
+		DROP TABLE tab_primary_key;
+]);
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
 
 # Verify that the upgrade should be successful with tables in 'ready'/'init'
 # state along with retaining the replication origin's remote lsn, and
@@ -63,7 +191,7 @@ $old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
 $publisher->safe_psql('postgres',
 	"CREATE PUBLICATION regress_pub2 FOR TABLE tab_upgraded1");
 $old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2"
+	"CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub2"
 );
 # Wait till the table tab_upgraded1 reaches 'ready' state
 my $synced_query =
@@ -73,7 +201,7 @@ $old_sub->poll_query_until('postgres', $synced_query)
 
 $publisher->safe_psql('postgres',
 	"INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))");
-$publisher->wait_for_catchup('regress_sub2');
+$publisher->wait_for_catchup('regress_sub3');
 
 # Change configuration to prepare a subscription table in init state
 $old_sub->append_conf('postgresql.conf',
@@ -83,7 +211,7 @@ $old_sub->restart;
 $publisher->safe_psql('postgres',
 	"ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
 $old_sub->safe_psql('postgres',
-	"ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+	"ALTER SUBSCRIPTION regress_sub3 REFRESH PUBLICATION");
 
 # The table tab_upgraded2 will be in init state as the subscriber
 # configuration for max_logical_replication_workers is set to 0.
@@ -93,10 +221,10 @@ is($result, qq(t), "Check that the table is in init state");
 
 # Get the replication origin's remote_lsn of the old subscriber
 my $remote_lsn = $old_sub->safe_psql('postgres',
-	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub2'"
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub3'"
 );
 # Have the subscription in disabled state before upgrade
-$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub3 DISABLE");
 
 my $tab_upgraded1_oid = $old_sub->safe_psql('postgres',
 	"SELECT oid FROM pg_class WHERE relname = 'tab_upgraded1'");
@@ -139,16 +267,17 @@ $new_sub->start;
 
 # The subscription's running status and failover option should be preserved
 # in the upgraded instance. So regress_sub1 should still have subenabled and
-# subfailover set to true, while regress_sub2 should have both set to false.
-$result =
-  $new_sub->safe_psql('postgres',
-	"SELECT subname, subenabled, subfailover FROM pg_subscription ORDER BY subname");
+# subfailover set to true, while regress_sub3 should have both set to false.
+$result = $new_sub->safe_psql('postgres',
+	"SELECT subname, subenabled, subfailover FROM pg_subscription ORDER BY subname"
+);
 is( $result, qq(regress_sub1|t|t
-regress_sub2|f|f),
-	"check that the subscription's running status and failover are preserved");
+regress_sub3|f|f),
+	"check that the subscription's running status and failover are preserved"
+);
 
-my $sub_oid = $new_sub->safe_psql('postgres',
-	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+$sub_oid = $new_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub3'");
 
 # Subscription relations should be preserved
 $result = $new_sub->safe_psql('postgres',
@@ -166,10 +295,10 @@ $result = $new_sub->safe_psql('postgres',
 is($result, qq($remote_lsn), "remote_lsn should have been preserved");
 
 # Enable the subscription
-$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub3 ENABLE");
 
-# Wait until all tables of subscription 'regress_sub2' are synchronized
-$new_sub->wait_for_subscription_sync($publisher, 'regress_sub2');
+# Wait until all tables of subscription 'regress_sub3' are synchronized
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub3');
 
 # Rows on tab_upgraded1 and tab_upgraded2 should have been replicated
 $result =
@@ -181,139 +310,5 @@ is($result, qq(1),
 	"check the data is synced after enabling the subscription for the table that was in init state"
 );
 
-# cleanup
-$new_sub->stop;
-$old_sub->append_conf('postgresql.conf',
-	"max_logical_replication_workers = 4");
-$old_sub->start;
-$old_sub->safe_psql(
-	'postgres', qq[
-		ALTER SUBSCRIPTION regress_sub1 DISABLE;
-		ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none);
-		DROP SUBSCRIPTION regress_sub1;
-]);
-$old_sub->stop;
-
-# ------------------------------------------------------
-# Check that pg_upgrade fails when max_replication_slots configured in the new
-# cluster is less than the number of subscriptions in the old cluster.
-# ------------------------------------------------------
-my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
-$new_sub1->init;
-$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
-
-# pg_upgrade will fail because the new cluster has insufficient
-# max_replication_slots.
-command_checks_all(
-	[
-		'pg_upgrade', '--no-sync',
-		'-d', $old_sub->data_dir,
-		'-D', $new_sub1->data_dir,
-		'-b', $oldbindir,
-		'-B', $newbindir,
-		'-s', $new_sub1->host,
-		'-p', $old_sub->port,
-		'-P', $new_sub1->port,
-		$mode, '--check',
-	],
-	1,
-	[
-		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
-	],
-	[qr//],
-	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
-);
-
-# Reset max_replication_slots
-$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
-
-# Drop the subscription
-$old_sub->start;
-$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");
-
-# ------------------------------------------------------
-# Check that pg_upgrade refuses to run if:
-# a) there's a subscription with tables in a state other than 'r' (ready) or
-#    'i' (init) and/or
-# b) the subscription has no replication origin.
-# ------------------------------------------------------
-$publisher->safe_psql(
-	'postgres', qq[
-		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
-		INSERT INTO tab_primary_key values(1);
-		CREATE PUBLICATION regress_pub3 FOR TABLE tab_primary_key;
-]);
-
-# Insert the same value that is already present in publisher to the primary key
-# column of subscriber so that the table sync will fail.
-$old_sub->safe_psql(
-	'postgres', qq[
-		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
-		INSERT INTO tab_primary_key values(1);
-		CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub3;
-]);
-
-# Table will be in 'd' (data is being copied) state as table sync will fail
-# because of primary key constraint error.
-my $started_query =
-  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
-$old_sub->poll_query_until('postgres', $started_query)
-  or die
-  "Timed out while waiting for the table state to become 'd' (datasync)";
-
-# Create another subscription and drop the subscription's replication origin
-$old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (enabled = false)"
-);
-$sub_oid = $old_sub->safe_psql('postgres',
-	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
-my $reporigin = 'pg_' . qq($sub_oid);
-$old_sub->safe_psql('postgres',
-	"SELECT pg_replication_origin_drop('$reporigin')");
-
-$old_sub->stop;
-
-command_fails(
-	[
-		'pg_upgrade', '--no-sync',
-		'-d', $old_sub->data_dir,
-		'-D', $new_sub1->data_dir,
-		'-b', $oldbindir,
-		'-B', $newbindir,
-		'-s', $new_sub1->host,
-		'-p', $old_sub->port,
-		'-P', $new_sub1->port,
-		$mode, '--check',
-	],
-	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
-);
-
-# Verify the reason why the subscriber cannot be upgraded
-my $sub_relstate_filename;
-
-# Find a txt file that contains a list of tables that cannot be upgraded. We
-# cannot predict the file's path because the output directory contains a
-# milliseconds timestamp. File::Find::find must be used.
-find(
-	sub {
-		if ($File::Find::name =~ m/subs_invalid\.txt/)
-		{
-			$sub_relstate_filename = $File::Find::name;
-		}
-	},
-	$new_sub1->data_dir . "/pg_upgrade_output.d");
-
-# Check the file content which should have tab_primary_key table in invalid
-# state.
-like(
-	slurp_file($sub_relstate_filename),
-	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub3\" schema:\"public\" relation:\"tab_primary_key\"/m,
-	'the previous test failed due to subscription table in invalid state');
-
-# Check the file content which should have regress_sub4 subscription.
-like(
-	slurp_file($sub_relstate_filename),
-	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub4\"/m,
-	'the previous test failed due to missing replication origin');
 
 done_testing();
-- 
2.43.0

#190Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: vignesh C (#186)
1 attachment(s)
RE: pg_upgrade and logical replication

Dear Vignesh,

Thanks for verifying the fix!

Your proposal to change the tests into the following order: a) failure
due to insufficient max_replication_slots, b) failure because
pg_subscription_rel has a 'd' state, c) successful upgrade, looks good
to me.

Right.

I have also verified that your changes fixes the issue as the
successful upgrade is moved to the end and the old cluster is no
longer used after upgrade.

Yeah, it is the same as my expectation.

One minor suggestion:
There is an extra line break here; it can be removed:
@@ -181,139 +310,5 @@ is($result, qq(1),
"check the data is synced after enabling the subscription for
the table that was in init state"
);

-# cleanup

Removed.

PSA a new version patch.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

Attachments:

v2-0001-Fix-testcase.patch (application/octet-stream)
From a9ad4dfd5d6dbe5e02e5a9fbf7b7710d0f5a45b4 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Wed, 14 Feb 2024 03:01:08 +0000
Subject: [PATCH v2] Fix testcase

---
 src/bin/pg_upgrade/t/004_subscription.pl | 304 +++++++++++------------
 1 file changed, 149 insertions(+), 155 deletions(-)

diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
index 63c0a98376..5780cb6fb4 100644
--- a/src/bin/pg_upgrade/t/004_subscription.pl
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -5,6 +5,7 @@ use strict;
 use warnings FATAL => 'all';
 
 use File::Find qw(find);
+use File::Path qw(rmtree);
 
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
@@ -49,13 +50,140 @@ $old_sub->safe_psql(
 # Setup logical replication
 my $connstr = $publisher->connstr . ' dbname=postgres';
 
-# Setup an enabled subscription to verify that the running status and failover
-# option are retained after the upgrade.
+# Setup a subscription to verify that the failover option are retained after
+# the upgrade.
 $publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
 $old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1 WITH (failover = true)"
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1 WITH (failover = true, enabled = false)"
 );
-$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+$old_sub->start;
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init) and/or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		ALTER PUBLICATION regress_pub1 ADD TABLE tab_primary_key;
+]);
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		ALTER SUBSCRIPTION regress_sub1 ENABLE;
+		ALTER SUBSCRIPTION regress_sub1 REFRESH PUBLICATION;
+]);
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub1 WITH (enabled = false)"
+);
+my $sub_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+my $reporigin = 'pg_' . qq($sub_oid);
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub1\" schema:\"public\" relation:\"tab_primary_key\"/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub2 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub2\"/m,
+	'the previous test failed due to missing replication origin');
+
+# cleanup
+$old_sub->start;
+$publisher->safe_psql(
+	'postgres', qq[
+		ALTER PUBLICATION regress_pub1 DROP TABLE tab_primary_key;
+		DROP TABLE tab_primary_key;
+]);
+$old_sub->safe_psql(
+	'postgres', qq[
+		DROP SUBSCRIPTION regress_sub2;
+		ALTER SUBSCRIPTION regress_sub1 REFRESH PUBLICATION;
+		DROP TABLE tab_primary_key;
+]);
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
 
 # Verify that the upgrade should be successful with tables in 'ready'/'init'
 # state along with retaining the replication origin's remote lsn, and
@@ -63,7 +191,7 @@ $old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
 $publisher->safe_psql('postgres',
 	"CREATE PUBLICATION regress_pub2 FOR TABLE tab_upgraded1");
 $old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2"
+	"CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub2"
 );
 # Wait till the table tab_upgraded1 reaches 'ready' state
 my $synced_query =
@@ -73,7 +201,7 @@ $old_sub->poll_query_until('postgres', $synced_query)
 
 $publisher->safe_psql('postgres',
 	"INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))");
-$publisher->wait_for_catchup('regress_sub2');
+$publisher->wait_for_catchup('regress_sub3');
 
 # Change configuration to prepare a subscription table in init state
 $old_sub->append_conf('postgresql.conf',
@@ -83,7 +211,7 @@ $old_sub->restart;
 $publisher->safe_psql('postgres',
 	"ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
 $old_sub->safe_psql('postgres',
-	"ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+	"ALTER SUBSCRIPTION regress_sub3 REFRESH PUBLICATION");
 
 # The table tab_upgraded2 will be in init state as the subscriber
 # configuration for max_logical_replication_workers is set to 0.
@@ -93,10 +221,10 @@ is($result, qq(t), "Check that the table is in init state");
 
 # Get the replication origin's remote_lsn of the old subscriber
 my $remote_lsn = $old_sub->safe_psql('postgres',
-	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub2'"
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub3'"
 );
 # Have the subscription in disabled state before upgrade
-$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub3 DISABLE");
 
 my $tab_upgraded1_oid = $old_sub->safe_psql('postgres',
 	"SELECT oid FROM pg_class WHERE relname = 'tab_upgraded1'");
@@ -139,16 +267,17 @@ $new_sub->start;
 
 # The subscription's running status and failover option should be preserved
 # in the upgraded instance. So regress_sub1 should still have subenabled and
-# subfailover set to true, while regress_sub2 should have both set to false.
-$result =
-  $new_sub->safe_psql('postgres',
-	"SELECT subname, subenabled, subfailover FROM pg_subscription ORDER BY subname");
+# subfailover set to true, while regress_sub3 should have both set to false.
+$result = $new_sub->safe_psql('postgres',
+	"SELECT subname, subenabled, subfailover FROM pg_subscription ORDER BY subname"
+);
 is( $result, qq(regress_sub1|t|t
-regress_sub2|f|f),
-	"check that the subscription's running status and failover are preserved");
+regress_sub3|f|f),
+	"check that the subscription's running status and failover are preserved"
+);
 
-my $sub_oid = $new_sub->safe_psql('postgres',
-	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+$sub_oid = $new_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub3'");
 
 # Subscription relations should be preserved
 $result = $new_sub->safe_psql('postgres',
@@ -166,10 +295,10 @@ $result = $new_sub->safe_psql('postgres',
 is($result, qq($remote_lsn), "remote_lsn should have been preserved");
 
 # Enable the subscription
-$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub3 ENABLE");
 
-# Wait until all tables of subscription 'regress_sub2' are synchronized
-$new_sub->wait_for_subscription_sync($publisher, 'regress_sub2');
+# Wait until all tables of subscription 'regress_sub3' are synchronized
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub3');
 
 # Rows on tab_upgraded1 and tab_upgraded2 should have been replicated
 $result =
@@ -181,139 +310,4 @@ is($result, qq(1),
 	"check the data is synced after enabling the subscription for the table that was in init state"
 );
 
-# cleanup
-$new_sub->stop;
-$old_sub->append_conf('postgresql.conf',
-	"max_logical_replication_workers = 4");
-$old_sub->start;
-$old_sub->safe_psql(
-	'postgres', qq[
-		ALTER SUBSCRIPTION regress_sub1 DISABLE;
-		ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none);
-		DROP SUBSCRIPTION regress_sub1;
-]);
-$old_sub->stop;
-
-# ------------------------------------------------------
-# Check that pg_upgrade fails when max_replication_slots configured in the new
-# cluster is less than the number of subscriptions in the old cluster.
-# ------------------------------------------------------
-my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
-$new_sub1->init;
-$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
-
-# pg_upgrade will fail because the new cluster has insufficient
-# max_replication_slots.
-command_checks_all(
-	[
-		'pg_upgrade', '--no-sync',
-		'-d', $old_sub->data_dir,
-		'-D', $new_sub1->data_dir,
-		'-b', $oldbindir,
-		'-B', $newbindir,
-		'-s', $new_sub1->host,
-		'-p', $old_sub->port,
-		'-P', $new_sub1->port,
-		$mode, '--check',
-	],
-	1,
-	[
-		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
-	],
-	[qr//],
-	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
-);
-
-# Reset max_replication_slots
-$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
-
-# Drop the subscription
-$old_sub->start;
-$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");
-
-# ------------------------------------------------------
-# Check that pg_upgrade refuses to run if:
-# a) there's a subscription with tables in a state other than 'r' (ready) or
-#    'i' (init) and/or
-# b) the subscription has no replication origin.
-# ------------------------------------------------------
-$publisher->safe_psql(
-	'postgres', qq[
-		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
-		INSERT INTO tab_primary_key values(1);
-		CREATE PUBLICATION regress_pub3 FOR TABLE tab_primary_key;
-]);
-
-# Insert the same value that is already present in publisher to the primary key
-# column of subscriber so that the table sync will fail.
-$old_sub->safe_psql(
-	'postgres', qq[
-		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
-		INSERT INTO tab_primary_key values(1);
-		CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub3;
-]);
-
-# Table will be in 'd' (data is being copied) state as table sync will fail
-# because of primary key constraint error.
-my $started_query =
-  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
-$old_sub->poll_query_until('postgres', $started_query)
-  or die
-  "Timed out while waiting for the table state to become 'd' (datasync)";
-
-# Create another subscription and drop the subscription's replication origin
-$old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (enabled = false)"
-);
-$sub_oid = $old_sub->safe_psql('postgres',
-	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
-my $reporigin = 'pg_' . qq($sub_oid);
-$old_sub->safe_psql('postgres',
-	"SELECT pg_replication_origin_drop('$reporigin')");
-
-$old_sub->stop;
-
-command_fails(
-	[
-		'pg_upgrade', '--no-sync',
-		'-d', $old_sub->data_dir,
-		'-D', $new_sub1->data_dir,
-		'-b', $oldbindir,
-		'-B', $newbindir,
-		'-s', $new_sub1->host,
-		'-p', $old_sub->port,
-		'-P', $new_sub1->port,
-		$mode, '--check',
-	],
-	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
-);
-
-# Verify the reason why the subscriber cannot be upgraded
-my $sub_relstate_filename;
-
-# Find a txt file that contains a list of tables that cannot be upgraded. We
-# cannot predict the file's path because the output directory contains a
-# milliseconds timestamp. File::Find::find must be used.
-find(
-	sub {
-		if ($File::Find::name =~ m/subs_invalid\.txt/)
-		{
-			$sub_relstate_filename = $File::Find::name;
-		}
-	},
-	$new_sub1->data_dir . "/pg_upgrade_output.d");
-
-# Check the file content which should have tab_primary_key table in invalid
-# state.
-like(
-	slurp_file($sub_relstate_filename),
-	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub3\" schema:\"public\" relation:\"tab_primary_key\"/m,
-	'the previous test failed due to subscription table in invalid state');
-
-# Check the file content which should have regress_sub4 subscription.
-like(
-	slurp_file($sub_relstate_filename),
-	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub4\"/m,
-	'the previous test failed due to missing replication origin');
-
 done_testing();
-- 
2.43.0

#191Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#189)
Re: pg_upgrade and logical replication

On Wed, Feb 14, 2024 at 9:07 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

pg_upgrade/t/004_subscription.pl says

|my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';

..but I think maybe it should not.

When you try to use --link, it fails:
https://cirrus-ci.com/task/4669494061170688

|Adding ".old" suffix to old global/pg_control ok
|
|If you want to start the old cluster, you will need to remove
|the ".old" suffix from /tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata/global/pg_control.old.
|Because "link" mode was used, the old cluster cannot be safely
|started once the new cluster has been started.
|...
|
|postgres: could not find the database system
|Expected to find it in the directory "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata",
|but could not open file "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata/global/pg_control": No such file or directory
|# No postmaster PID for node "old_sub"
|[19:36:01.396](0.250s) Bail out! pg_ctl start failed
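As a side note for anyone reproducing this failure by hand: the `$mode` fallback quoted above (`$ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy'`) has a direct shell analogue. A minimal sketch — the environment variable name is the real one from the test file; everything else is illustrative:

```shell
# Mirror of the Perl fallback in 004_subscription.pl: use
# PG_TEST_PG_UPGRADE_MODE if it is set to a non-empty value,
# otherwise default to --copy (shell ":-" default expansion).
mode="${PG_TEST_PG_UPGRADE_MODE:---copy}"
echo "$mode"
```

So running the suite with `PG_TEST_PG_UPGRADE_MODE=--link` is what puts pg_upgrade into the link mode that triggers the bail-out shown in the log.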

Good catch! The root cause of the failure is that the old cluster is reused even
after a successful upgrade. The documentation says [1]:

If you use link mode, the upgrade will be much faster (no file copying) and use less
disk space, but you will not be able to access your old cluster once you start the new
cluster after the upgrade.

You could rename pg_control.old to avoid that immediate error, but that doesn't
address the essential issue that "the old cluster cannot be safely started once
the new cluster has been started."
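The rename mentioned above is just moving the control file back into place. A minimal sketch in a throwaway scratch directory — the paths are placeholders standing in for the old cluster's data directory, and, as noted, this only clears the immediate start-up error; it does not make the old cluster safe after the new one has run in link mode:

```shell
# Simulate undoing pg_upgrade's ".old" rename in a scratch directory.
# $scratch stands in for t_004_subscription_old_sub_data/pgdata (placeholder).
scratch=$(mktemp -d)
mkdir -p "$scratch/global"
: > "$scratch/global/pg_control.old"   # what pg_upgrade leaves behind
mv "$scratch/global/pg_control.old" "$scratch/global/pg_control"
ls "$scratch/global"
```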

Yeah, I agree that accessing the old cluster after the upgrade should be avoided.
IIUC, pg_upgrade is run three times in 004_subscription:

1. successful upgrade
2. failure due to insufficient max_replication_slots
3. failure because pg_subscription_rel has a 'd' state

And the old instance is reused in all of those runs. Therefore, the most reasonable
fix is to change the ordering of the tests, i.e., the "successful upgrade" case should run last.

This sounds like a reasonable way to address the reported problem.
Justin, do let me know if you think otherwise?

Comment:
===========
*
-# Setup an enabled subscription to verify that the running status and failover
-# option are retained after the upgrade.
+# Setup a subscription to verify that the failover option are retained after
+# the upgrade.
 $publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
 $old_sub->safe_psql('postgres',
- "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
regress_pub1 WITH (failover = true)"
+ "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
regress_pub1 WITH (failover = true, enabled = false)"
 );

I think it is better not to create a subscription in the early stage
which we wanted to use for the success case. Let's have separate
subscriptions for failure and success cases. I think that will avoid
the newly added ALTER statements in the patch.

--
With Regards,
Amit Kapila.

#192Justin Pryzby
pryzby@telsasoft.com
In reply to: Hayato Kuroda (Fujitsu) (#189)
Re: pg_upgrade and logical replication

On Wed, Feb 14, 2024 at 03:37:03AM +0000, Hayato Kuroda (Fujitsu) wrote:

Attached patch modified the test accordingly. Also, it contains some optimizations.
This can pass the test on my env:

What optimizations? I can't see them, and since the patch is described
as rearranging test cases (and therefore already difficult to read), I
guess they should be a separate patch, or the optimizations described.

--
Justin

#193Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Amit Kapila (#191)
1 attachment(s)
RE: pg_upgrade and logical replication

Dear Amit,

This sounds like a reasonable way to address the reported problem.

OK, thanks!

Justin, do let me know if you think otherwise?

Comment:
===========
*
-# Setup an enabled subscription to verify that the running status and failover
-# option are retained after the upgrade.
+# Setup a subscription to verify that the failover option are retained after
+# the upgrade.
$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
$old_sub->safe_psql('postgres',
- "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
regress_pub1 WITH (failover = true)"
+ "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
PUBLICATION
regress_pub1 WITH (failover = true, enabled = false)"
);

I think it is better not to create a subscription in the early stage
which we wanted to use for the success case. Let's have separate
subscriptions for failure and success cases. I think that will avoid
the newly added ALTER statements in the patch.

I had made the patch avoid creating objects as much as possible, but that
may lead to some confusion. I recreated the patch so that it creates pub/sub
objects and drops them at cleanup in every test case.

PSA a new version.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

Attachments:

v3-0001-Fix-testcase.patch (application/octet-stream)
From 4c81a0feb4712738f681fe49857304be700e4690 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Wed, 14 Feb 2024 03:01:08 +0000
Subject: [PATCH v3] Fix testcase

---
 src/bin/pg_upgrade/t/004_subscription.pl | 328 ++++++++++++-----------
 1 file changed, 165 insertions(+), 163 deletions(-)

diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
index 63c0a98376..bfcbcb7a7c 100644
--- a/src/bin/pg_upgrade/t/004_subscription.pl
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -5,6 +5,7 @@ use strict;
 use warnings FATAL => 'all';
 
 use File::Find qw(find);
+use File::Path qw(rmtree);
 
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
@@ -49,21 +50,150 @@ $old_sub->safe_psql(
 # Setup logical replication
 my $connstr = $publisher->connstr . ' dbname=postgres';
 
-# Setup an enabled subscription to verify that the running status and failover
-# option are retained after the upgrade.
+# Setup a disabled subscription. The upcoming test will check the
+# pg_createsubscriber won't work, so it is sufficient.
 $publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
 $old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1 WITH (failover = true)"
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1 WITH (enabled = false)"
 );
-$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+# Cleanup
+$publisher->safe_psql('postgres', "DROP PUBLICATION regress_pub1");
+$old_sub->start;
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1;");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init) and/or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE PUBLICATION regress_pub2 FOR TABLE tab_primary_key;
+]);
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2;
+]);
+
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub2 WITH (enabled = false)"
+);
+my $sub_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub3'");
+my $reporigin = 'pg_' . qq($sub_oid);
+$old_sub->safe_psql('postgres',
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
+);
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub2\" schema:\"public\" relation:\"tab_primary_key\"/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub2 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub3\"/m,
+	'the previous test failed due to missing replication origin');
+
+# cleanup
+$old_sub->start;
+$publisher->safe_psql(
+	'postgres', qq[
+		DROP PUBLICATION regress_pub2;
+		DROP TABLE tab_primary_key;
+]);
+$old_sub->safe_psql(
+	'postgres', qq[
+		DROP SUBSCRIPTION regress_sub2;
+		DROP SUBSCRIPTION regress_sub3;
+		DROP TABLE tab_primary_key;
+]);
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
 
 # Verify that the upgrade should be successful with tables in 'ready'/'init'
 # state along with retaining the replication origin's remote lsn, and
 # subscription's running status.
 $publisher->safe_psql('postgres',
-	"CREATE PUBLICATION regress_pub2 FOR TABLE tab_upgraded1");
+	"CREATE PUBLICATION regress_pub3 FOR TABLE tab_upgraded1");
 $old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2"
+	"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (failover = true)"
 );
 # Wait till the table tab_upgraded1 reaches 'ready' state
 my $synced_query =
@@ -73,7 +203,7 @@ $old_sub->poll_query_until('postgres', $synced_query)
 
 $publisher->safe_psql('postgres',
 	"INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))");
-$publisher->wait_for_catchup('regress_sub2');
+$publisher->wait_for_catchup('regress_sub4');
 
 # Change configuration to prepare a subscription table in init state
 $old_sub->append_conf('postgresql.conf',
@@ -81,9 +211,10 @@ $old_sub->append_conf('postgresql.conf',
 $old_sub->restart;
 
 $publisher->safe_psql('postgres',
-	"ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+	"CREATE PUBLICATION regress_pub4 FOR TABLE tab_upgraded2");
 $old_sub->safe_psql('postgres',
-	"ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+	"CREATE SUBSCRIPTION regress_sub5 CONNECTION '$connstr' PUBLICATION regress_pub4"
+);
 
 # The table tab_upgraded2 will be in init state as the subscriber
 # configuration for max_logical_replication_workers is set to 0.
@@ -93,10 +224,10 @@ is($result, qq(t), "Check that the table is in init state");
 
 # Get the replication origin's remote_lsn of the old subscriber
 my $remote_lsn = $old_sub->safe_psql('postgres',
-	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub2'"
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub4'"
 );
 # Have the subscription in disabled state before upgrade
-$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 DISABLE");
 
 my $tab_upgraded1_oid = $old_sub->safe_psql('postgres',
 	"SELECT oid FROM pg_class WHERE relname = 'tab_upgraded1'");
@@ -105,6 +236,10 @@ my $tab_upgraded2_oid = $old_sub->safe_psql('postgres',
 
 $old_sub->stop;
 
+# Change configuration as well not to start the initial sync automatically
+$new_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+
 # ------------------------------------------------------
 # Check that pg_upgrade is successful when all tables are in ready or in
 # init state (tab_upgraded1 table is in ready state and tab_upgraded2 table is
@@ -138,21 +273,19 @@ $publisher->safe_psql(
 $new_sub->start;
 
 # The subscription's running status and failover option should be preserved
-# in the upgraded instance. So regress_sub1 should still have subenabled and
-# subfailover set to true, while regress_sub2 should have both set to false.
-$result =
-  $new_sub->safe_psql('postgres',
-	"SELECT subname, subenabled, subfailover FROM pg_subscription ORDER BY subname");
-is( $result, qq(regress_sub1|t|t
-regress_sub2|f|f),
-	"check that the subscription's running status and failover are preserved");
-
-my $sub_oid = $new_sub->safe_psql('postgres',
-	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+# in the upgraded instance. So regress_sub4 should still have subenabled and
+# subfailover set to true, while regress_sub5 should have both set to false.
+$result = $new_sub->safe_psql('postgres',
+	"SELECT subname, subenabled, subfailover FROM pg_subscription ORDER BY subname"
+);
+is( $result, qq(regress_sub4|t|t
+regress_sub5|f|f),
+	"check that the subscription's running status and failover are preserved"
+);
 
 # Subscription relations should be preserved
 $result = $new_sub->safe_psql('postgres',
-	"SELECT srrelid, srsubstate FROM pg_subscription_rel WHERE srsubid = $sub_oid ORDER BY srrelid"
+	"SELECT srrelid, srsubstate FROM pg_subscription_rel ORDER BY srrelid"
 );
 is( $result, qq($tab_upgraded1_oid|r
 $tab_upgraded2_oid|i),
@@ -160,16 +293,20 @@ $tab_upgraded2_oid|i),
 );
 
 # The replication origin's remote_lsn should be preserved
+$sub_oid = $new_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
 $result = $new_sub->safe_psql('postgres',
 	"SELECT remote_lsn FROM pg_replication_origin_status WHERE external_id = 'pg_' || $sub_oid"
 );
 is($result, qq($remote_lsn), "remote_lsn should have been preserved");
 
-# Enable the subscription
-$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
-
-# Wait until all tables of subscription 'regress_sub2' are synchronized
-$new_sub->wait_for_subscription_sync($publisher, 'regress_sub2');
+# Resume the initial sync and wait until all tables of subscription
+# 'regress_sub5' are synchronized
+$new_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 10");
+$new_sub->restart;
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 ENABLE");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');
 
 # Rows on tab_upgraded1 and tab_upgraded2 should have been replicated
 $result =
@@ -181,139 +318,4 @@ is($result, qq(1),
 	"check the data is synced after enabling the subscription for the table that was in init state"
 );
 
-# cleanup
-$new_sub->stop;
-$old_sub->append_conf('postgresql.conf',
-	"max_logical_replication_workers = 4");
-$old_sub->start;
-$old_sub->safe_psql(
-	'postgres', qq[
-		ALTER SUBSCRIPTION regress_sub1 DISABLE;
-		ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none);
-		DROP SUBSCRIPTION regress_sub1;
-]);
-$old_sub->stop;
-
-# ------------------------------------------------------
-# Check that pg_upgrade fails when max_replication_slots configured in the new
-# cluster is less than the number of subscriptions in the old cluster.
-# ------------------------------------------------------
-my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
-$new_sub1->init;
-$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
-
-# pg_upgrade will fail because the new cluster has insufficient
-# max_replication_slots.
-command_checks_all(
-	[
-		'pg_upgrade', '--no-sync',
-		'-d', $old_sub->data_dir,
-		'-D', $new_sub1->data_dir,
-		'-b', $oldbindir,
-		'-B', $newbindir,
-		'-s', $new_sub1->host,
-		'-p', $old_sub->port,
-		'-P', $new_sub1->port,
-		$mode, '--check',
-	],
-	1,
-	[
-		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
-	],
-	[qr//],
-	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
-);
-
-# Reset max_replication_slots
-$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
-
-# Drop the subscription
-$old_sub->start;
-$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");
-
-# ------------------------------------------------------
-# Check that pg_upgrade refuses to run if:
-# a) there's a subscription with tables in a state other than 'r' (ready) or
-#    'i' (init) and/or
-# b) the subscription has no replication origin.
-# ------------------------------------------------------
-$publisher->safe_psql(
-	'postgres', qq[
-		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
-		INSERT INTO tab_primary_key values(1);
-		CREATE PUBLICATION regress_pub3 FOR TABLE tab_primary_key;
-]);
-
-# Insert the same value that is already present in publisher to the primary key
-# column of subscriber so that the table sync will fail.
-$old_sub->safe_psql(
-	'postgres', qq[
-		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
-		INSERT INTO tab_primary_key values(1);
-		CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub3;
-]);
-
-# Table will be in 'd' (data is being copied) state as table sync will fail
-# because of primary key constraint error.
-my $started_query =
-  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
-$old_sub->poll_query_until('postgres', $started_query)
-  or die
-  "Timed out while waiting for the table state to become 'd' (datasync)";
-
-# Create another subscription and drop the subscription's replication origin
-$old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (enabled = false)"
-);
-$sub_oid = $old_sub->safe_psql('postgres',
-	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
-my $reporigin = 'pg_' . qq($sub_oid);
-$old_sub->safe_psql('postgres',
-	"SELECT pg_replication_origin_drop('$reporigin')");
-
-$old_sub->stop;
-
-command_fails(
-	[
-		'pg_upgrade', '--no-sync',
-		'-d', $old_sub->data_dir,
-		'-D', $new_sub1->data_dir,
-		'-b', $oldbindir,
-		'-B', $newbindir,
-		'-s', $new_sub1->host,
-		'-p', $old_sub->port,
-		'-P', $new_sub1->port,
-		$mode, '--check',
-	],
-	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
-);
-
-# Verify the reason why the subscriber cannot be upgraded
-my $sub_relstate_filename;
-
-# Find a txt file that contains a list of tables that cannot be upgraded. We
-# cannot predict the file's path because the output directory contains a
-# milliseconds timestamp. File::Find::find must be used.
-find(
-	sub {
-		if ($File::Find::name =~ m/subs_invalid\.txt/)
-		{
-			$sub_relstate_filename = $File::Find::name;
-		}
-	},
-	$new_sub1->data_dir . "/pg_upgrade_output.d");
-
-# Check the file content which should have tab_primary_key table in invalid
-# state.
-like(
-	slurp_file($sub_relstate_filename),
-	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub3\" schema:\"public\" relation:\"tab_primary_key\"/m,
-	'the previous test failed due to subscription table in invalid state');
-
-# Check the file content which should have regress_sub4 subscription.
-like(
-	slurp_file($sub_relstate_filename),
-	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub4\"/m,
-	'the previous test failed due to missing replication origin');
-
 done_testing();
-- 
2.43.0

#194 Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Justin Pryzby (#192)
RE: pg_upgrade and logical replication

Dear Justin,

Thanks for replying!

What optimizations? I can't see them, and since the patch is described
as rearranging test cases (and therefore already difficult to read), I
guess they should be a separate patch, or the optimizations described.

The basic idea was to reduce the number of CREATE/DROP statements,
but that has been changed for now: publications and subscriptions are created
and dropped per test case.

E.g., In case of successful upgrade, below steps were done:

1. create two publications
2. create a subscription with failover = true
3. avoid further initial sync by setting max_logical_replication_workers = 0
4. create another subscription
5. confirm statuses of tables are either of 'i' or 'r'
6. run pg_upgrade
7. confirm table statuses are preserved
8. confirm replication origins are preserved.
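In shell terms, the flow above is roughly the following (an illustrative sketch only: the data directories, ports, and connection string are placeholder variables, not the values the TAP test actually uses):

```shell
# 1-2. Create the publication and a failover-enabled subscription.
psql -p "$PUB_PORT" -c "CREATE PUBLICATION regress_pub4 FOR TABLE tab_upgraded1"
psql -p "$OLD_SUB_PORT" -c "CREATE SUBSCRIPTION regress_sub4 \
    CONNECTION 'port=$PUB_PORT dbname=postgres' PUBLICATION regress_pub4 \
    WITH (failover = true)"

# 3. Stop further initial syncs so the next table stays in 'i' (init) state.
echo "max_logical_replication_workers = 0" >> "$OLD_SUB_DATA/postgresql.conf"
pg_ctl -D "$OLD_SUB_DATA" restart

# 4. Create another publication/subscription pair; with no apply workers
#    available, its table cannot leave init state.
psql -p "$PUB_PORT" -c "CREATE PUBLICATION regress_pub5 FOR TABLE tab_upgraded2"
psql -p "$OLD_SUB_PORT" -c "CREATE SUBSCRIPTION regress_sub5 \
    CONNECTION 'port=$PUB_PORT dbname=postgres' PUBLICATION regress_pub5"

# 5. Every row should report srsubstate 'i' or 'r' before the upgrade.
psql -p "$OLD_SUB_PORT" -c \
    "SELECT srrelid::regclass, srsubstate FROM pg_subscription_rel"

# 6. Upgrade the stopped old subscriber.
pg_ctl -D "$OLD_SUB_DATA" stop
pg_upgrade -d "$OLD_SUB_DATA" -D "$NEW_SUB_DATA" -b "$OLD_BIN" -B "$NEW_BIN"

# 7-8. On the new cluster, table states and replication origins should have
#      been carried over.
psql -p "$NEW_SUB_PORT" -c \
    "SELECT srrelid::regclass, srsubstate FROM pg_subscription_rel"
psql -p "$NEW_SUB_PORT" -c \
    "SELECT external_id, remote_lsn FROM pg_replication_origin_status"
```

This requires live publisher and subscriber clusters, so it is meant as a reading aid for the patch, not a runnable script.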

New patch is available in [1].

[1]: /messages/by-id/TYCPR01MB12077B16EEDA360BA645B96F8F54C2@TYCPR01MB12077.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

#195 vignesh C
vignesh21@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#193)
Re: pg_upgrade and logical replication

On Fri, 16 Feb 2024 at 08:22, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Dear Amit,

This sounds like a reasonable way to address the reported problem.

OK, thanks!

Justin, do let me know if you think otherwise?

Comment:
===========
*
-# Setup an enabled subscription to verify that the running status and failover
-# option are retained after the upgrade.
+# Setup a subscription to verify that the failover option are retained after
+# the upgrade.
$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
$old_sub->safe_psql('postgres',
- "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
regress_pub1 WITH (failover = true)"
+ "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
PUBLICATION
regress_pub1 WITH (failover = true, enabled = false)"
);

I think it is better not to create a subscription in the early stage
which we wanted to use for the success case. Let's have separate
subscriptions for failure and success cases. I think that will avoid
the newly added ALTER statements in the patch.

I had made a patch that avoided creating objects as much as possible, but it
may cause some confusion. I have recreated the patch so that publications and
subscriptions are created, and dropped at cleanup, for every test case.

PSA a new version.

Thanks for the updated patch; a few suggestions:
1) Can we use a new publication for this subscription too so that the
publication and subscription naming will become consistent throughout
the test case:
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+       "CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr'
PUBLICATION regress_pub2 WITH (enabled = false)"
+);

So after the change, subscription regress_sub3 will use publication
regress_pub3, subscription regress_sub4 will use publication regress_pub4,
and subscription regress_sub5 will use publication regress_pub5.

2) The tab_upgraded1 table can be created together with the corresponding
CREATE PUBLICATION and CREATE SUBSCRIPTION statements:
$publisher->safe_psql('postgres',
"CREATE PUBLICATION regress_pub3 FOR TABLE tab_upgraded1");
$old_sub->safe_psql('postgres',
"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION
regress_pub3 WITH (failover = true)"
);

3) The tab_upgraded2 table can likewise be created together with the
corresponding CREATE PUBLICATION and CREATE SUBSCRIPTION statements, to keep
it consistent:
 $publisher->safe_psql('postgres',
-       "ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+       "CREATE PUBLICATION regress_pub4 FOR TABLE tab_upgraded2");
 $old_sub->safe_psql('postgres',
-       "ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+       "CREATE SUBSCRIPTION regress_sub5 CONNECTION '$connstr'
PUBLICATION regress_pub4"
+);

With the above fixes, the following can be removed:
# Initial setup
$publisher->safe_psql(
'postgres', qq[
CREATE TABLE tab_upgraded1(id int);
CREATE TABLE tab_upgraded2(id int);
]);
$old_sub->safe_psql(
'postgres', qq[
CREATE TABLE tab_upgraded1(id int);
CREATE TABLE tab_upgraded2(id int);
]);

Regards,
Vignesh

#196 Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: vignesh C (#195)
1 attachment(s)
RE: pg_upgrade and logical replication

Dear Vignesh,

Thanks for reviewing! PSA new version.

Thanks for the updated patch, few suggestions:
1) Can we use a new publication for this subscription too so that the
publication and subscription naming will become consistent throughout
the test case:
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+       "CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr'
PUBLICATION regress_pub2 WITH (enabled = false)"
+);

So after the change it will become like subscription regress_sub3 for
publication regress_pub3, subscription regress_sub4 for publication
regress_pub4 and subscription regress_sub5 for publication
regress_pub5.

A new publication was defined.

2) The tab_upgraded1 table can be created along with create
publication and create subscription itself:
$publisher->safe_psql('postgres',
"CREATE PUBLICATION regress_pub3 FOR TABLE tab_upgraded1");
$old_sub->safe_psql('postgres',
"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION
regress_pub3 WITH (failover = true)"
);

The definition of tab_upgraded1 was moved to the place you pointed out.

3) The tab_upgraded2 table can be created along with create
publication and create subscription itself to keep it consistent:
$publisher->safe_psql('postgres',
-       "ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+       "CREATE PUBLICATION regress_pub4 FOR TABLE tab_upgraded2");
$old_sub->safe_psql('postgres',
-       "ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+       "CREATE SUBSCRIPTION regress_sub5 CONNECTION '$connstr'
PUBLICATION regress_pub4"
+);

Ditto.

With above fixes, the following can be removed:
# Initial setup
$publisher->safe_psql(
'postgres', qq[
CREATE TABLE tab_upgraded1(id int);
CREATE TABLE tab_upgraded2(id int);
]);
$old_sub->safe_psql(
'postgres', qq[
CREATE TABLE tab_upgraded1(id int);
CREATE TABLE tab_upgraded2(id int);
]);

Yes, the earlier definitions were removed instead.
Some comments were also adjusted to match these fixes.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

Attachments:

v4-0001-Fix-testcase.patch (application/octet-stream)
From 4f23ca3819d58a29efa4172c80d8ff8925936c65 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Wed, 14 Feb 2024 03:01:08 +0000
Subject: [PATCH v4] Fix testcase

---
 src/bin/pg_upgrade/t/004_subscription.pl | 358 ++++++++++++-----------
 1 file changed, 182 insertions(+), 176 deletions(-)

diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
index 63c0a98376..25d9e2d6bf 100644
--- a/src/bin/pg_upgrade/t/004_subscription.pl
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -5,6 +5,7 @@ use strict;
 use warnings FATAL => 'all';
 
 use File::Find qw(find);
+use File::Path qw(rmtree);
 
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
@@ -34,37 +35,164 @@ my $newbindir = $new_sub->config_data('--bindir');
 # in it, like delete_old_cluster.{sh,bat}.
 chdir ${PostgreSQL::Test::Utils::tmp_check};
 
-# Initial setup
+# Setup logical replication
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+# Setup a disabled subscription. The upcoming test will check the
+# pg_createsubscriber won't work, so it is sufficient.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1 WITH (enabled = false)"
+);
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+# Cleanup
+$publisher->safe_psql('postgres', "DROP PUBLICATION regress_pub1");
+$old_sub->start;
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1;");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init) and/or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
 $publisher->safe_psql(
 	'postgres', qq[
-		CREATE TABLE tab_upgraded1(id int);
-		CREATE TABLE tab_upgraded2(id int);
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE PUBLICATION regress_pub2 FOR TABLE tab_primary_key;
 ]);
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
 $old_sub->safe_psql(
 	'postgres', qq[
-		CREATE TABLE tab_upgraded1(id int);
-		CREATE TABLE tab_upgraded2(id int);
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2;
 ]);
 
-# Setup logical replication
-my $connstr = $publisher->connstr . ' dbname=postgres';
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
 
-# Setup an enabled subscription to verify that the running status and failover
-# option are retained after the upgrade.
-$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+# Setup another logical replication and drop the subscription's replication
+# origin.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub3");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (enabled = false)"
+);
+my $sub_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub3'");
+my $reporigin = 'pg_' . qq($sub_oid);
 $old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1 WITH (failover = true)"
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
 );
-$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# milliseconds timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub2\" schema:\"public\" relation:\"tab_primary_key\"/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub2 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub3\"/m,
+	'the previous test failed due to missing replication origin');
+
+# Cleanup
+$old_sub->start;
+$publisher->safe_psql(
+	'postgres', qq[
+		DROP PUBLICATION regress_pub2;
+		DROP PUBLICATION regress_pub3;
+		DROP TABLE tab_primary_key;
+]);
+$old_sub->safe_psql(
+	'postgres', qq[
+		DROP SUBSCRIPTION regress_sub2;
+		DROP SUBSCRIPTION regress_sub3;
+		DROP TABLE tab_primary_key;
+]);
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
 
 # Verify that the upgrade should be successful with tables in 'ready'/'init'
 # state along with retaining the replication origin's remote lsn, and
 # subscription's running status.
-$publisher->safe_psql('postgres',
-	"CREATE PUBLICATION regress_pub2 FOR TABLE tab_upgraded1");
-$old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2"
-);
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE PUBLICATION regress_pub4 FOR TABLE tab_upgraded1;
+]);
+
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub4 WITH (failover = true);
+]);
+
 # Wait till the table tab_upgraded1 reaches 'ready' state
 my $synced_query =
   "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
@@ -73,17 +201,24 @@ $old_sub->poll_query_until('postgres', $synced_query)
 
 $publisher->safe_psql('postgres',
 	"INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))");
-$publisher->wait_for_catchup('regress_sub2');
+$publisher->wait_for_catchup('regress_sub4');
 
 # Change configuration to prepare a subscription table in init state
 $old_sub->append_conf('postgresql.conf',
 	"max_logical_replication_workers = 0");
 $old_sub->restart;
 
-$publisher->safe_psql('postgres',
-	"ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
-$old_sub->safe_psql('postgres',
-	"ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+# Setup another logical replication
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded2(id int);
+		CREATE PUBLICATION regress_pub5 FOR TABLE tab_upgraded2;
+]);
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded2(id int);
+		CREATE SUBSCRIPTION regress_sub5 CONNECTION '$connstr' PUBLICATION regress_pub5;
+]);
 
 # The table tab_upgraded2 will be in init state as the subscriber
 # configuration for max_logical_replication_workers is set to 0.
@@ -93,10 +228,10 @@ is($result, qq(t), "Check that the table is in init state");
 
 # Get the replication origin's remote_lsn of the old subscriber
 my $remote_lsn = $old_sub->safe_psql('postgres',
-	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub2'"
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub4'"
 );
 # Have the subscription in disabled state before upgrade
-$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 DISABLE");
 
 my $tab_upgraded1_oid = $old_sub->safe_psql('postgres',
 	"SELECT oid FROM pg_class WHERE relname = 'tab_upgraded1'");
@@ -105,6 +240,10 @@ my $tab_upgraded2_oid = $old_sub->safe_psql('postgres',
 
 $old_sub->stop;
 
+# Change configuration as well not to start the initial sync automatically
+$new_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+
 # ------------------------------------------------------
 # Check that pg_upgrade is successful when all tables are in ready or in
 # init state (tab_upgraded1 table is in ready state and tab_upgraded2 table is
@@ -138,21 +277,19 @@ $publisher->safe_psql(
 $new_sub->start;
 
 # The subscription's running status and failover option should be preserved
-# in the upgraded instance. So regress_sub1 should still have subenabled and
-# subfailover set to true, while regress_sub2 should have both set to false.
-$result =
-  $new_sub->safe_psql('postgres',
-	"SELECT subname, subenabled, subfailover FROM pg_subscription ORDER BY subname");
-is( $result, qq(regress_sub1|t|t
-regress_sub2|f|f),
-	"check that the subscription's running status and failover are preserved");
-
-my $sub_oid = $new_sub->safe_psql('postgres',
-	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+# in the upgraded instance. So regress_sub4 should still have subenabled and
+# subfailover set to true, while regress_sub5 should have both set to false.
+$result = $new_sub->safe_psql('postgres',
+	"SELECT subname, subenabled, subfailover FROM pg_subscription ORDER BY subname"
+);
+is( $result, qq(regress_sub4|t|t
+regress_sub5|f|f),
+	"check that the subscription's running status and failover are preserved"
+);
 
 # Subscription relations should be preserved
 $result = $new_sub->safe_psql('postgres',
-	"SELECT srrelid, srsubstate FROM pg_subscription_rel WHERE srsubid = $sub_oid ORDER BY srrelid"
+	"SELECT srrelid, srsubstate FROM pg_subscription_rel ORDER BY srrelid"
 );
 is( $result, qq($tab_upgraded1_oid|r
 $tab_upgraded2_oid|i),
@@ -160,16 +297,20 @@ $tab_upgraded2_oid|i),
 );
 
 # The replication origin's remote_lsn should be preserved
+$sub_oid = $new_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
 $result = $new_sub->safe_psql('postgres',
 	"SELECT remote_lsn FROM pg_replication_origin_status WHERE external_id = 'pg_' || $sub_oid"
 );
 is($result, qq($remote_lsn), "remote_lsn should have been preserved");
 
-# Enable the subscription
-$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
-
-# Wait until all tables of subscription 'regress_sub2' are synchronized
-$new_sub->wait_for_subscription_sync($publisher, 'regress_sub2');
+# Resume the initial sync and wait until all tables of subscription
+# 'regress_sub5' are synchronized
+$new_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 10");
+$new_sub->restart;
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 ENABLE");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');
 
 # Rows on tab_upgraded1 and tab_upgraded2 should have been replicated
 $result =
@@ -181,139 +322,4 @@ is($result, qq(1),
 	"check the data is synced after enabling the subscription for the table that was in init state"
 );
 
-# cleanup
-$new_sub->stop;
-$old_sub->append_conf('postgresql.conf',
-	"max_logical_replication_workers = 4");
-$old_sub->start;
-$old_sub->safe_psql(
-	'postgres', qq[
-		ALTER SUBSCRIPTION regress_sub1 DISABLE;
-		ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none);
-		DROP SUBSCRIPTION regress_sub1;
-]);
-$old_sub->stop;
-
-# ------------------------------------------------------
-# Check that pg_upgrade fails when max_replication_slots configured in the new
-# cluster is less than the number of subscriptions in the old cluster.
-# ------------------------------------------------------
-my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
-$new_sub1->init;
-$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
-
-# pg_upgrade will fail because the new cluster has insufficient
-# max_replication_slots.
-command_checks_all(
-	[
-		'pg_upgrade', '--no-sync',
-		'-d', $old_sub->data_dir,
-		'-D', $new_sub1->data_dir,
-		'-b', $oldbindir,
-		'-B', $newbindir,
-		'-s', $new_sub1->host,
-		'-p', $old_sub->port,
-		'-P', $new_sub1->port,
-		$mode, '--check',
-	],
-	1,
-	[
-		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
-	],
-	[qr//],
-	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
-);
-
-# Reset max_replication_slots
-$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
-
-# Drop the subscription
-$old_sub->start;
-$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");
-
-# ------------------------------------------------------
-# Check that pg_upgrade refuses to run if:
-# a) there's a subscription with tables in a state other than 'r' (ready) or
-#    'i' (init) and/or
-# b) the subscription has no replication origin.
-# ------------------------------------------------------
-$publisher->safe_psql(
-	'postgres', qq[
-		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
-		INSERT INTO tab_primary_key values(1);
-		CREATE PUBLICATION regress_pub3 FOR TABLE tab_primary_key;
-]);
-
-# Insert the same value that is already present in publisher to the primary key
-# column of subscriber so that the table sync will fail.
-$old_sub->safe_psql(
-	'postgres', qq[
-		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
-		INSERT INTO tab_primary_key values(1);
-		CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub3;
-]);
-
-# Table will be in 'd' (data is being copied) state as table sync will fail
-# because of primary key constraint error.
-my $started_query =
-  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
-$old_sub->poll_query_until('postgres', $started_query)
-  or die
-  "Timed out while waiting for the table state to become 'd' (datasync)";
-
-# Create another subscription and drop the subscription's replication origin
-$old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (enabled = false)"
-);
-$sub_oid = $old_sub->safe_psql('postgres',
-	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
-my $reporigin = 'pg_' . qq($sub_oid);
-$old_sub->safe_psql('postgres',
-	"SELECT pg_replication_origin_drop('$reporigin')");
-
-$old_sub->stop;
-
-command_fails(
-	[
-		'pg_upgrade', '--no-sync',
-		'-d', $old_sub->data_dir,
-		'-D', $new_sub1->data_dir,
-		'-b', $oldbindir,
-		'-B', $newbindir,
-		'-s', $new_sub1->host,
-		'-p', $old_sub->port,
-		'-P', $new_sub1->port,
-		$mode, '--check',
-	],
-	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
-);
-
-# Verify the reason why the subscriber cannot be upgraded
-my $sub_relstate_filename;
-
-# Find a txt file that contains a list of tables that cannot be upgraded. We
-# cannot predict the file's path because the output directory contains a
-# milliseconds timestamp. File::Find::find must be used.
-find(
-	sub {
-		if ($File::Find::name =~ m/subs_invalid\.txt/)
-		{
-			$sub_relstate_filename = $File::Find::name;
-		}
-	},
-	$new_sub1->data_dir . "/pg_upgrade_output.d");
-
-# Check the file content which should have tab_primary_key table in invalid
-# state.
-like(
-	slurp_file($sub_relstate_filename),
-	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub3\" schema:\"public\" relation:\"tab_primary_key\"/m,
-	'the previous test failed due to subscription table in invalid state');
-
-# Check the file content which should have regress_sub4 subscription.
-like(
-	slurp_file($sub_relstate_filename),
-	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub4\"/m,
-	'the previous test failed due to missing replication origin');
-
 done_testing();
-- 
2.43.0

#197Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#196)
Re: pg_upgrade and logical replication

On Fri, Feb 16, 2024 at 10:50 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Thanks for reviewing! PSA new version.

+# Setup a disabled subscription. The upcoming test will check the
+# pg_createsubscriber won't work, so it is sufficient.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");

Why is pg_createsubscriber referred to here? I think it is a typo.

Other than that patch looks good to me.

--
With Regards,
Amit Kapila.

#198vignesh C
vignesh21@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#196)
Re: pg_upgrade and logical replication

On Fri, 16 Feb 2024 at 10:50, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Dear Vignesh,

Thanks for reviewing! PSA new version.

Thanks for the updated patch; a few suggestions:
1) Can we use a new publication for this subscription too so that the
publication and subscription naming will become consistent throughout
the test case:
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+       "CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr'
PUBLICATION regress_pub2 WITH (enabled = false)"
+);

So after the change it will become like subscription regress_sub3 for
publication regress_pub3, subscription regress_sub4 for publication
regress_pub4 and subscription regress_sub5 for publication
regress_pub5.

A new publication was defined.

2) The tab_upgraded1 table can be created along with create
publication and create subscription itself:
$publisher->safe_psql('postgres',
"CREATE PUBLICATION regress_pub3 FOR TABLE tab_upgraded1");
$old_sub->safe_psql('postgres',
"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION
regress_pub3 WITH (failover = true)"
);

The definition of tab_upgraded1 was moved to the place you pointed out.

3) The tab_upgraded2 table can be created along with create
publication and create subscription itself to keep it consistent:
$publisher->safe_psql('postgres',
-       "ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+       "CREATE PUBLICATION regress_pub4 FOR TABLE tab_upgraded2");
$old_sub->safe_psql('postgres',
-       "ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+       "CREATE SUBSCRIPTION regress_sub5 CONNECTION '$connstr'
PUBLICATION regress_pub4"
+);

Ditto.

With above fixes, the following can be removed:
# Initial setup
$publisher->safe_psql(
'postgres', qq[
CREATE TABLE tab_upgraded1(id int);
CREATE TABLE tab_upgraded2(id int);
]);
$old_sub->safe_psql(
'postgres', qq[
CREATE TABLE tab_upgraded1(id int);
CREATE TABLE tab_upgraded2(id int);
]);

Yes, earlier definitions were removed instead.
Also, some comments were adjusted based on these fixes.

Thanks for the updated patch; a few suggestions:
1)  This can be moved to keep it similar to other tests:
+# Setup a disabled subscription. The upcoming test will check the
+# pg_createsubscriber won't work, so it is sufficient.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
PUBLICATION regress_pub1 WITH (enabled = false)"
+);
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+       [
+               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+               '-D', $new_sub->data_dir, '-b', $oldbindir,
+               '-B', $newbindir, '-s', $new_sub->host,
+               '-p', $old_sub->port, '-P', $new_sub->port,
+               $mode, '--check',
+       ],
like below and the extra comment can be removed:
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+# Create a disabled subscription.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
PUBLICATION regress_pub1 WITH (enabled = false)"
+);
+
+$old_sub->stop;
+
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+       [
+               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+               '-D', $new_sub->data_dir, '-b', $oldbindir,
+               '-B', $newbindir, '-s', $new_sub->host,
+               '-p', $old_sub->port, '-P', $new_sub->port,
+               $mode, '--check',
+       ],
2) This comment can be slightly changed:
+# Change configuration as well not to start the initial sync automatically
+$new_sub->append_conf('postgresql.conf',
+       "max_logical_replication_workers = 0");

to:
Change configuration so that initial table sync does not get
started automatically

3) The old comments were slightly better:
# Resume the initial sync and wait until all tables of subscription
# 'regress_sub5' are synchronized
$new_sub->append_conf('postgresql.conf',
"max_logical_replication_workers = 10");
$new_sub->restart;
$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 ENABLE");
$new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');

Like:
# Enable the subscription
$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 ENABLE");

# Wait until all tables of subscription 'regress_sub5' are synchronized
$new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');

Regards,
Vignesh

#199Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#198)
Re: pg_upgrade and logical replication

On Sat, Feb 17, 2024 at 10:05 AM vignesh C <vignesh21@gmail.com> wrote:

On Fri, 16 Feb 2024 at 10:50, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Thanks for the updated patch, Few suggestions:
1)  This can be moved to keep it similar to other tests:
+# Setup a disabled subscription. The upcoming test will check the
+# pg_createsubscriber won't work, so it is sufficient.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
PUBLICATION regress_pub1 WITH (enabled = false)"
+);
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+       [
+               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+               '-D', $new_sub->data_dir, '-b', $oldbindir,
+               '-B', $newbindir, '-s', $new_sub->host,
+               '-p', $old_sub->port, '-P', $new_sub->port,
+               $mode, '--check',
+       ],
like below and the extra comment can be removed:
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+# Create a disabled subscription.

It is okay to adjust as you are suggesting, but I find Kuroda-San's
comment better than just saying "Create a disabled subscription.", as
his version explicitly tells why it is okay to create a disabled
subscription.

3) The old comments were slightly better:
# Resume the initial sync and wait until all tables of subscription
# 'regress_sub5' are synchronized
$new_sub->append_conf('postgresql.conf',
"max_logical_replication_workers = 10");
$new_sub->restart;
$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 ENABLE");
$new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');

Like:
# Enable the subscription
$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 ENABLE");

# Wait until all tables of subscription 'regress_sub5' are synchronized
$new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');

I would prefer Kuroda-San's version, as his comment explains the intent
of the test better, whereas your suggestion just repeats what the next
line of code is doing, which is already self-explanatory.

--
With Regards,
Amit Kapila.

#200Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: vignesh C (#198)
1 attachment(s)
RE: pg_upgrade and logical replication

Dear Vignesh,

Thanks for reviewing! PSA new version.

Thanks for the updated patch, Few suggestions:
1)  This can be moved to keep it similar to other tests:
+# Setup a disabled subscription. The upcoming test will check the
+# pg_createsubscriber won't work, so it is sufficient.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
PUBLICATION regress_pub1 WITH (enabled = false)"
+);
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+       [
+               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+               '-D', $new_sub->data_dir, '-b', $oldbindir,
+               '-B', $newbindir, '-s', $new_sub->host,
+               '-p', $old_sub->port, '-P', $new_sub->port,
+               $mode, '--check',
+       ],
like below and the extra comment can be removed:
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+# Create a disabled subscription.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
PUBLICATION regress_pub1 WITH (enabled = false)"
+);
+
+$old_sub->stop;
+
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+       [
+               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+               '-D', $new_sub->data_dir, '-b', $oldbindir,
+               '-B', $newbindir, '-s', $new_sub->host,
+               '-p', $old_sub->port, '-P', $new_sub->port,
+               $mode, '--check',
+       ],

Partially fixed. I moved the creation part below, but the comments were kept.

2) This comment can be slightly changed:
+# Change configuration as well not to start the initial sync automatically
+$new_sub->append_conf('postgresql.conf',
+       "max_logical_replication_workers = 0");

to:
Change configuration so that initial table sync does not get
started automatically

Fixed.

3) The old comments were slightly better:
# Resume the initial sync and wait until all tables of subscription
# 'regress_sub5' are synchronized
$new_sub->append_conf('postgresql.conf',
"max_logical_replication_workers = 10");
$new_sub->restart;
$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5
ENABLE");
$new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');

Like:
# Enable the subscription
$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5
ENABLE");

# Wait until all tables of subscription 'regress_sub5' are synchronized
$new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');

Per comments from Amit [1], I did not change it.

[1]: /messages/by-id/CAA4eK1Ls+RmJtTvOgaRXd+eHSY3x-KUE=sfEGQoU-JF_UzA62A@mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

Attachments:

v5-0001-Fix-testcase.patch (application/octet-stream)
From 1847dbd85904f93b66147bc8188a96cea4888354 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Wed, 14 Feb 2024 03:01:08 +0000
Subject: [PATCH v5] Fix testcase

---
 src/bin/pg_upgrade/t/004_subscription.pl | 360 ++++++++++++-----------
 1 file changed, 184 insertions(+), 176 deletions(-)

diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
index 63c0a98376..85d2cf21a4 100644
--- a/src/bin/pg_upgrade/t/004_subscription.pl
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -5,6 +5,7 @@ use strict;
 use warnings FATAL => 'all';
 
 use File::Find qw(find);
+use File::Path qw(rmtree);
 
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
@@ -34,37 +35,165 @@ my $newbindir = $new_sub->config_data('--bindir');
 # in it, like delete_old_cluster.{sh,bat}.
 chdir ${PostgreSQL::Test::Utils::tmp_check};
 
-# Initial setup
+# Remember a connection string to the publisher node. It will be used
+# several times.
+my $connstr = $publisher->connstr . ' dbname=postgres';
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+# Setup a disabled subscription. The upcoming test will check that
+# pg_upgrade won't work, so it is sufficient.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1 WITH (enabled = false)"
+);
+
+$old_sub->stop;
+
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode, '--check',
+	],
+	1,
+	[
+		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
+	],
+	[qr//],
+	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
+);
+
+# Reset max_replication_slots
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 10");
+
+# Cleanup
+$publisher->safe_psql('postgres', "DROP PUBLICATION regress_pub1");
+$old_sub->start;
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1;");
+
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run if:
+# a) there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init) and/or
+# b) the subscription has no replication origin.
+# ------------------------------------------------------
 $publisher->safe_psql(
 	'postgres', qq[
-		CREATE TABLE tab_upgraded1(id int);
-		CREATE TABLE tab_upgraded2(id int);
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE PUBLICATION regress_pub2 FOR TABLE tab_primary_key;
 ]);
+
+# Insert the same value that is already present in publisher to the primary key
+# column of subscriber so that the table sync will fail.
 $old_sub->safe_psql(
 	'postgres', qq[
-		CREATE TABLE tab_upgraded1(id int);
-		CREATE TABLE tab_upgraded2(id int);
+		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
+		INSERT INTO tab_primary_key values(1);
+		CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2;
 ]);
 
-# Setup logical replication
-my $connstr = $publisher->connstr . ' dbname=postgres';
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
 
-# Setup an enabled subscription to verify that the running status and failover
-# option are retained after the upgrade.
-$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+# Setup another logical replication and drop the subscription's replication
+# origin.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub3");
+$old_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (enabled = false)"
+);
+my $sub_oid = $old_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub3'");
+my $reporigin = 'pg_' . qq($sub_oid);
 $old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION regress_pub1 WITH (failover = true)"
+	"SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;
+
+command_fails(
+	[
+		'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+		'-D', $new_sub->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-s', $new_sub->host,
+		'-p', $old_sub->port, '-P', $new_sub->port,
+		$mode, '--check',
+	],
+	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
 );
-$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+# Verify the reason why the subscriber cannot be upgraded
+my $sub_relstate_filename;
+
+# Find a txt file that contains a list of tables that cannot be upgraded. We
+# cannot predict the file's path because the output directory contains a
+# millisecond timestamp. File::Find::find must be used.
+find(
+	sub {
+		if ($File::Find::name =~ m/subs_invalid\.txt/)
+		{
+			$sub_relstate_filename = $File::Find::name;
+		}
+	},
+	$new_sub->data_dir . "/pg_upgrade_output.d");
+
+# Check the file content which should have tab_primary_key table in invalid
+# state.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub2\" schema:\"public\" relation:\"tab_primary_key\"/m,
+	'the previous test failed due to subscription table in invalid state');
+
+# Check the file content which should have regress_sub2 subscription.
+like(
+	slurp_file($sub_relstate_filename),
+	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub3\"/m,
+	'the previous test failed due to missing replication origin');
+
+# Cleanup
+$old_sub->start;
+$publisher->safe_psql(
+	'postgres', qq[
+		DROP PUBLICATION regress_pub2;
+		DROP PUBLICATION regress_pub3;
+		DROP TABLE tab_primary_key;
+]);
+$old_sub->safe_psql(
+	'postgres', qq[
+		DROP SUBSCRIPTION regress_sub2;
+		DROP SUBSCRIPTION regress_sub3;
+		DROP TABLE tab_primary_key;
+]);
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
 
 # Verify that the upgrade should be successful with tables in 'ready'/'init'
 # state along with retaining the replication origin's remote lsn, and
 # subscription's running status.
-$publisher->safe_psql('postgres',
-	"CREATE PUBLICATION regress_pub2 FOR TABLE tab_upgraded1");
-$old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub2 CONNECTION '$connstr' PUBLICATION regress_pub2"
-);
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE PUBLICATION regress_pub4 FOR TABLE tab_upgraded1;
+]);
+
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded1(id int);
+		CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub4 WITH (failover = true);
+]);
+
 # Wait till the table tab_upgraded1 reaches 'ready' state
 my $synced_query =
   "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
@@ -73,17 +202,24 @@ $old_sub->poll_query_until('postgres', $synced_query)
 
 $publisher->safe_psql('postgres',
 	"INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))");
-$publisher->wait_for_catchup('regress_sub2');
+$publisher->wait_for_catchup('regress_sub4');
 
 # Change configuration to prepare a subscription table in init state
 $old_sub->append_conf('postgresql.conf',
 	"max_logical_replication_workers = 0");
 $old_sub->restart;
 
-$publisher->safe_psql('postgres',
-	"ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
-$old_sub->safe_psql('postgres',
-	"ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+# Setup another logical replication
+$publisher->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded2(id int);
+		CREATE PUBLICATION regress_pub5 FOR TABLE tab_upgraded2;
+]);
+$old_sub->safe_psql(
+	'postgres', qq[
+		CREATE TABLE tab_upgraded2(id int);
+		CREATE SUBSCRIPTION regress_sub5 CONNECTION '$connstr' PUBLICATION regress_pub5;
+]);
 
 # The table tab_upgraded2 will be in init state as the subscriber
 # configuration for max_logical_replication_workers is set to 0.
@@ -93,10 +229,10 @@ is($result, qq(t), "Check that the table is in init state");
 
 # Get the replication origin's remote_lsn of the old subscriber
 my $remote_lsn = $old_sub->safe_psql('postgres',
-	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub2'"
+	"SELECT remote_lsn FROM pg_replication_origin_status os, pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub4'"
 );
 # Have the subscription in disabled state before upgrade
-$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 DISABLE");
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 DISABLE");
 
 my $tab_upgraded1_oid = $old_sub->safe_psql('postgres',
 	"SELECT oid FROM pg_class WHERE relname = 'tab_upgraded1'");
@@ -105,6 +241,11 @@ my $tab_upgraded2_oid = $old_sub->safe_psql('postgres',
 
 $old_sub->stop;
 
+# Change configuration so that initial table sync does not get started
+# automatically
+$new_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 0");
+
 # ------------------------------------------------------
 # Check that pg_upgrade is successful when all tables are in ready or in
 # init state (tab_upgraded1 table is in ready state and tab_upgraded2 table is
@@ -138,21 +279,19 @@ $publisher->safe_psql(
 $new_sub->start;
 
 # The subscription's running status and failover option should be preserved
-# in the upgraded instance. So regress_sub1 should still have subenabled and
-# subfailover set to true, while regress_sub2 should have both set to false.
-$result =
-  $new_sub->safe_psql('postgres',
-	"SELECT subname, subenabled, subfailover FROM pg_subscription ORDER BY subname");
-is( $result, qq(regress_sub1|t|t
-regress_sub2|f|f),
-	"check that the subscription's running status and failover are preserved");
-
-my $sub_oid = $new_sub->safe_psql('postgres',
-	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+# in the upgraded instance. So regress_sub4 should still have subenabled and
+# subfailover set to true, while regress_sub5 should have both set to false.
+$result = $new_sub->safe_psql('postgres',
+	"SELECT subname, subenabled, subfailover FROM pg_subscription ORDER BY subname"
+);
+is( $result, qq(regress_sub4|t|t
+regress_sub5|f|f),
+	"check that the subscription's running status and failover are preserved"
+);
 
 # Subscription relations should be preserved
 $result = $new_sub->safe_psql('postgres',
-	"SELECT srrelid, srsubstate FROM pg_subscription_rel WHERE srsubid = $sub_oid ORDER BY srrelid"
+	"SELECT srrelid, srsubstate FROM pg_subscription_rel ORDER BY srrelid"
 );
 is( $result, qq($tab_upgraded1_oid|r
 $tab_upgraded2_oid|i),
@@ -160,16 +299,20 @@ $tab_upgraded2_oid|i),
 );
 
 # The replication origin's remote_lsn should be preserved
+$sub_oid = $new_sub->safe_psql('postgres',
+	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
 $result = $new_sub->safe_psql('postgres',
 	"SELECT remote_lsn FROM pg_replication_origin_status WHERE external_id = 'pg_' || $sub_oid"
 );
 is($result, qq($remote_lsn), "remote_lsn should have been preserved");
 
-# Enable the subscription
-$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
-
-# Wait until all tables of subscription 'regress_sub2' are synchronized
-$new_sub->wait_for_subscription_sync($publisher, 'regress_sub2');
+# Resume the initial sync and wait until all tables of subscription
+# 'regress_sub5' are synchronized
+$new_sub->append_conf('postgresql.conf',
+	"max_logical_replication_workers = 10");
+$new_sub->restart;
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 ENABLE");
+$new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');
 
 # Rows on tab_upgraded1 and tab_upgraded2 should have been replicated
 $result =
@@ -181,139 +324,4 @@ is($result, qq(1),
 	"check the data is synced after enabling the subscription for the table that was in init state"
 );
 
-# cleanup
-$new_sub->stop;
-$old_sub->append_conf('postgresql.conf',
-	"max_logical_replication_workers = 4");
-$old_sub->start;
-$old_sub->safe_psql(
-	'postgres', qq[
-		ALTER SUBSCRIPTION regress_sub1 DISABLE;
-		ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none);
-		DROP SUBSCRIPTION regress_sub1;
-]);
-$old_sub->stop;
-
-# ------------------------------------------------------
-# Check that pg_upgrade fails when max_replication_slots configured in the new
-# cluster is less than the number of subscriptions in the old cluster.
-# ------------------------------------------------------
-my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
-$new_sub1->init;
-$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
-
-# pg_upgrade will fail because the new cluster has insufficient
-# max_replication_slots.
-command_checks_all(
-	[
-		'pg_upgrade', '--no-sync',
-		'-d', $old_sub->data_dir,
-		'-D', $new_sub1->data_dir,
-		'-b', $oldbindir,
-		'-B', $newbindir,
-		'-s', $new_sub1->host,
-		'-p', $old_sub->port,
-		'-P', $new_sub1->port,
-		$mode, '--check',
-	],
-	1,
-	[
-		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
-	],
-	[qr//],
-	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
-);
-
-# Reset max_replication_slots
-$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 10");
-
-# Drop the subscription
-$old_sub->start;
-$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");
-
-# ------------------------------------------------------
-# Check that pg_upgrade refuses to run if:
-# a) there's a subscription with tables in a state other than 'r' (ready) or
-#    'i' (init) and/or
-# b) the subscription has no replication origin.
-# ------------------------------------------------------
-$publisher->safe_psql(
-	'postgres', qq[
-		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
-		INSERT INTO tab_primary_key values(1);
-		CREATE PUBLICATION regress_pub3 FOR TABLE tab_primary_key;
-]);
-
-# Insert the same value that is already present in publisher to the primary key
-# column of subscriber so that the table sync will fail.
-$old_sub->safe_psql(
-	'postgres', qq[
-		CREATE TABLE tab_primary_key(id serial PRIMARY KEY);
-		INSERT INTO tab_primary_key values(1);
-		CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr' PUBLICATION regress_pub3;
-]);
-
-# Table will be in 'd' (data is being copied) state as table sync will fail
-# because of primary key constraint error.
-my $started_query =
-  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
-$old_sub->poll_query_until('postgres', $started_query)
-  or die
-  "Timed out while waiting for the table state to become 'd' (datasync)";
-
-# Create another subscription and drop the subscription's replication origin
-$old_sub->safe_psql('postgres',
-	"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (enabled = false)"
-);
-$sub_oid = $old_sub->safe_psql('postgres',
-	"SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
-my $reporigin = 'pg_' . qq($sub_oid);
-$old_sub->safe_psql('postgres',
-	"SELECT pg_replication_origin_drop('$reporigin')");
-
-$old_sub->stop;
-
-command_fails(
-	[
-		'pg_upgrade', '--no-sync',
-		'-d', $old_sub->data_dir,
-		'-D', $new_sub1->data_dir,
-		'-b', $oldbindir,
-		'-B', $newbindir,
-		'-s', $new_sub1->host,
-		'-p', $old_sub->port,
-		'-P', $new_sub1->port,
-		$mode, '--check',
-	],
-	'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state and missing replication origin'
-);
-
-# Verify the reason why the subscriber cannot be upgraded
-my $sub_relstate_filename;
-
-# Find a txt file that contains a list of tables that cannot be upgraded. We
-# cannot predict the file's path because the output directory contains a
-# milliseconds timestamp. File::Find::find must be used.
-find(
-	sub {
-		if ($File::Find::name =~ m/subs_invalid\.txt/)
-		{
-			$sub_relstate_filename = $File::Find::name;
-		}
-	},
-	$new_sub1->data_dir . "/pg_upgrade_output.d");
-
-# Check the file content which should have tab_primary_key table in invalid
-# state.
-like(
-	slurp_file($sub_relstate_filename),
-	qr/The table sync state \"d\" is not allowed for database:\"postgres\" subscription:\"regress_sub3\" schema:\"public\" relation:\"tab_primary_key\"/m,
-	'the previous test failed due to subscription table in invalid state');
-
-# Check the file content which should have regress_sub4 subscription.
-like(
-	slurp_file($sub_relstate_filename),
-	qr/The replication origin is missing for database:\"postgres\" subscription:\"regress_sub4\"/m,
-	'the previous test failed due to missing replication origin');
-
 done_testing();
-- 
2.43.0

#201vignesh C
vignesh21@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#200)
Re: pg_upgrade and logical replication

On Mon, 19 Feb 2024 at 06:54, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Dear Vignesh,

Thanks for reviewing! PSA new version.

Thanks for the updated patch, Few suggestions:
1)  This can be moved to keep it similar to other tests:
+# Setup a disabled subscription. The upcoming test will check the
+# pg_createsubscriber won't work, so it is sufficient.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
PUBLICATION regress_pub1 WITH (enabled = false)"
+);
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+       [
+               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+               '-D', $new_sub->data_dir, '-b', $oldbindir,
+               '-B', $newbindir, '-s', $new_sub->host,
+               '-p', $old_sub->port, '-P', $new_sub->port,
+               $mode, '--check',
+       ],
like below and the extra comment can be removed:
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+# Create a disabled subscription.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
PUBLICATION regress_pub1 WITH (enabled = false)"
+);
+
+$old_sub->stop;
+
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+       [
+               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+               '-D', $new_sub->data_dir, '-b', $oldbindir,
+               '-B', $newbindir, '-s', $new_sub->host,
+               '-p', $old_sub->port, '-P', $new_sub->port,
+               $mode, '--check',
+       ],

Partially fixed. I moved the creation part below, but kept the comments.

2) This comment can be slightly changed:
+# Change configuration as well not to start the initial sync automatically
+$new_sub->append_conf('postgresql.conf',
+       "max_logical_replication_workers = 0");

to:
Change configuration so that the initial table sync does not get
started automatically

Fixed.

3) The old comments were slightly better:
# Resume the initial sync and wait until all tables of subscription
# 'regress_sub5' are synchronized
$new_sub->append_conf('postgresql.conf',
"max_logical_replication_workers = 10");
$new_sub->restart;
$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5
ENABLE");
$new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');

Like:
# Enable the subscription
$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5
ENABLE");

# Wait until all tables of subscription 'regress_sub5' are synchronized
$new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');

Per comments from Amit [1], I did not change this.

Thanks for the updated patch, I don't have any more comments.

Regards,
Vignesh

#202Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#200)
Re: pg_upgrade and logical replication

On Mon, Feb 19, 2024 at 6:54 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Thanks for reviewing! PSA new version.

Pushed this after making minor changes in the comments.

--
With Regards,
Amit Kapila.

#203vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#202)
1 attachment(s)
Re: pg_upgrade and logical replication

On Mon, 19 Feb 2024 at 12:38, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Feb 19, 2024 at 6:54 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Thanks for reviewing! PSA new version.

Pushed this after making minor changes in the comments.

Recently there was a failure in the 004_subscription tap test at [1].
In this failure, the tab_upgraded1 table was expected to have 51
records but has only 50 records. Before the upgrade both publisher and
subscriber have 50 records.
After the upgrade we have inserted one record in the publisher, now
tab_upgraded1 will have 51 records in the publisher. Then we start the
subscriber after changing max_logical_replication_workers so that
apply workers get started and apply the changes received. After
starting we enable regress_sub5, wait for sync of regress_sub5
subscription and check for tab_upgraded1 and tab_upgraded2 table data.
In a few random cases the one record that was inserted into
tab_upgraded1 table will not get replicated as we have not waited for
regress_sub4 subscription to apply the changes from the publisher.
The attached patch has changes to wait for regress_sub4 subscription
to apply the changes from the publisher before verifying the data.

[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2024-03-26%2004%3A23%3A13
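The fix boils down to waiting on a condition with a timeout before asserting on replicated data, which is what the TAP helpers wait_for_catchup and wait_for_subscription_sync do under the hood. A minimal sketch of that polling pattern, written in Python rather than the Perl test framework (all names here are illustrative, not part of the test suite):

```python
import time

def poll_until(condition, timeout=180.0, interval=0.1):
    """Poll a zero-argument condition until it returns True or the
    timeout expires. Returns True on success, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return False

# Simulated apply worker state: the subscriber starts with the 50
# pre-upgrade rows and eventually applies the one post-upgrade insert.
applied = {"rows": 50}

def apply_pending_change():
    applied["rows"] = 51

# Checking immediately after startup can still observe 50 rows; polling
# until the subscription has caught up makes the check deterministic.
apply_pending_change()
assert poll_until(lambda: applied["rows"] == 51)
```

The same reasoning explains the failure: the test asserted on the row count without first polling for regress_sub4's catchup, so a slow apply worker could lose the race.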

Regards,
Vignesh

Attachments:

v1-0001-Fix-random-upgrade-failure-test-in-004_subscripti.patch (text/x-patch)
From 69d7adc27198d5cf3ecf5b8c7d62a3e41263bc2a Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Wed, 27 Mar 2024 11:27:34 +0530
Subject: [PATCH v1] Fix random upgrade failure test in 004_subscription tap
 test.

The test was failing because the incremental changes had not been replicated to
the subscriber after subscriber was started. The changes were not
replicated as we were checking the table data immediately after the
server was started. This test did not wait for regress_sub4 subscription to
apply the changes for tab_upgraded1 table. Fixed it by waiting until the
subscription was synced to get all the changes for tab_upgraded1 table data.
---
 src/bin/pg_upgrade/t/004_subscription.pl | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
index df5d6dffbc..06504e70c1 100644
--- a/src/bin/pg_upgrade/t/004_subscription.pl
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -306,12 +306,14 @@ $result = $new_sub->safe_psql('postgres',
 );
 is($result, qq($remote_lsn), "remote_lsn should have been preserved");
 
-# Resume the initial sync and wait until all tables of subscription
-# 'regress_sub5' are synchronized
+# Resume the initial sync
 $new_sub->append_conf('postgresql.conf',
 	"max_logical_replication_workers = 10");
 $new_sub->restart;
 $new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 ENABLE");
+
+# Wait until both subscriptions catch up the changes on the publisher
+$publisher->wait_for_catchup('regress_sub4');
 $new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');
 
 # Rows on tab_upgraded1 and tab_upgraded2 should have been replicated
-- 
2.34.1

#204Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: vignesh C (#203)
RE: pg_upgrade and logical replication

Dear Vignesh,

Recently there was a failure in 004_subscription tap test at [1].
In this failure, the tab_upgraded1 table was expected to have 51
records but has only 50 records. Before the upgrade both publisher and
subscriber have 50 records.

Good catch!

After the upgrade we have inserted one record in the publisher, now
tab_upgraded1 will have 51 records in the publisher. Then we start the
subscriber after changing max_logical_replication_workers so that
apply workers get started and apply the changes received. After
starting we enable regress_sub5, wait for sync of regress_sub5
subscription and check for tab_upgraded1 and tab_upgraded2 table data.
In a few random cases the one record that was inserted into
tab_upgraded1 table will not get replicated as we have not waited for
regress_sub4 subscription to apply the changes from the publisher.
The attached patch has changes to wait for regress_sub4 subscription
to apply the changes from the publisher before verifying the data.

[1] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2024-03-26%2004%3A23%3A13

Yeah, I think it is an oversight in f17529. Previously, subscriptions which
were receiving changes were confirmed to be caught up; I missed adding that
line while restructuring the script. +1 for your fix.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

#205Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#203)
Re: pg_upgrade and logical replication

On Wed, Mar 27, 2024 at 11:57 AM vignesh C <vignesh21@gmail.com> wrote:

The attached patch has changes to wait for regress_sub4 subscription
to apply the changes from the publisher before verifying the data.

Pushed after changing the order of the waits, as it looks logical to wait
for regress_sub5 after enabling the subscription. Thanks

--
With Regards,
Amit Kapila.

#206Nathan Bossart
nathandbossart@gmail.com
In reply to: Amit Kapila (#205)
Re: pg_upgrade and logical replication

I've been looking into optimizing pg_upgrade's once-in-each-database steps
[0], and I noticed that we are opening a connection to every database in
the cluster and running a query like

SELECT count(*) FROM pg_catalog.pg_subscription WHERE subdbid = %d;

Then, later on, we combine all of these values in
count_old_cluster_subscriptions() to verify that max_replication_slots is
set high enough. AFAICT these per-database subscription counts aren't used
for anything else.

This is an extremely expensive way to perform that check, and so I'm
wondering why we don't just do

SELECT count(*) FROM pg_catalog.pg_subscription;

once in count_old_cluster_subscriptions().

[0]: https://commitfest.postgresql.org/48/4995/
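Since pg_subscription is a shared catalog, the two formulations return the same total; a toy Python sketch (mocked catalog rows, not pg_upgrade code) of why a single cluster-wide count can replace the N per-database counts:

```python
# Mocked pg_subscription catalog: each row records which database
# (subdbid) owns the subscription. Contents are illustrative only.
pg_subscription = [
    {"subname": "sub1", "subdbid": 1},
    {"subname": "sub2", "subdbid": 1},
    {"subname": "sub3", "subdbid": 2},
]
databases = [1, 2, 3]

# Per-database approach: one connection and one "WHERE subdbid = ..."
# query per database, summed afterwards.
per_db_counts = [
    sum(1 for s in pg_subscription if s["subdbid"] == db)
    for db in databases
]
nsubs_per_db = sum(per_db_counts)

# Cluster-wide approach: a single count over the shared catalog,
# needing only one connection.
nsubs_total = len(pg_subscription)

assert nsubs_per_db == nsubs_total == 3
```

The totals always agree because every subscription belongs to exactly one database; only the number of round trips differs.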

--
nathan

#207Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#206)
1 attachment(s)
Re: pg_upgrade and logical replication

On Fri, Jul 19, 2024 at 03:44:22PM -0500, Nathan Bossart wrote:

I've been looking into optimizing pg_upgrade's once-in-each-database steps
[0], and I noticed that we are opening a connection to every database in
the cluster and running a query like

SELECT count(*) FROM pg_catalog.pg_subscription WHERE subdbid = %d;

Then, later on, we combine all of these values in
count_old_cluster_subscriptions() to verify that max_replication_slots is
set high enough. AFAICT these per-database subscription counts aren't used
for anything else.

This is an extremely expensive way to perform that check, and so I'm
wondering why we don't just do

SELECT count(*) FROM pg_catalog.pg_subscription;

once in count_old_cluster_subscriptions().

Like so...

--
nathan

Attachments:

v1-0001-pg_upgrade-retrieve-subscription-count-more-effic.patch (text/plain)
From a90db3fa008f73f7f97cc73e4453ebf2e981cb83 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 20 Jul 2024 21:01:29 -0500
Subject: [PATCH v1 1/1] pg_upgrade: retrieve subscription count more
 efficiently

---
 src/bin/pg_upgrade/check.c      |  9 +++-----
 src/bin/pg_upgrade/info.c       | 39 ++++++++-------------------------
 src/bin/pg_upgrade/pg_upgrade.h |  3 +--
 3 files changed, 13 insertions(+), 38 deletions(-)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 27924159d6..39d14b7b92 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -1797,17 +1797,14 @@ check_new_cluster_subscription_configuration(void)
 {
 	PGresult   *res;
 	PGconn	   *conn;
-	int			nsubs_on_old;
 	int			max_replication_slots;
 
 	/* Subscriptions and their dependencies can be migrated since PG17. */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
 		return;
 
-	nsubs_on_old = count_old_cluster_subscriptions();
-
 	/* Quick return if there are no subscriptions to be migrated. */
-	if (nsubs_on_old == 0)
+	if (old_cluster.nsubs == 0)
 		return;
 
 	prep_status("Checking for new cluster configuration for subscriptions");
@@ -1821,10 +1818,10 @@ check_new_cluster_subscription_configuration(void)
 		pg_fatal("could not determine parameter settings on new cluster");
 
 	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
-	if (nsubs_on_old > max_replication_slots)
+	if (old_cluster.nsubs > max_replication_slots)
 		pg_fatal("\"max_replication_slots\" (%d) must be greater than or equal to the number of "
 				 "subscriptions (%d) on the old cluster",
-				 max_replication_slots, nsubs_on_old);
+				 max_replication_slots, old_cluster.nsubs);
 
 	PQclear(res);
 	PQfinish(conn);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 95c22a7200..78398ec58b 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,7 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
-static void get_db_subscription_count(DbInfo *dbinfo);
+static void get_subscription_count(ClusterInfo *cluster);
 
 
 /*
@@ -298,12 +298,12 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 		 * count for the old cluster.
 		 */
 		if (cluster == &old_cluster)
-		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
-			get_db_subscription_count(pDbInfo);
-		}
 	}
 
+	if (cluster == &old_cluster)
+		get_subscription_count(cluster);
+
 	if (cluster == &old_cluster)
 		pg_log(PG_VERBOSE, "\nsource databases:");
 	else
@@ -750,14 +750,14 @@ count_old_cluster_logical_slots(void)
 /*
  * get_db_subscription_count()
  *
- * Gets the number of subscriptions in the database referred to by "dbinfo".
+ * Gets the number of subscriptions in the cluster.
  *
  * Note: This function will not do anything if the old cluster is pre-PG17.
  * This is because before that the logical slots are not upgraded, so we will
  * not be able to upgrade the logical replication clusters completely.
  */
 static void
-get_db_subscription_count(DbInfo *dbinfo)
+get_subscription_count(ClusterInfo *cluster)
 {
 	PGconn	   *conn;
 	PGresult   *res;
@@ -766,36 +766,15 @@ get_db_subscription_count(DbInfo *dbinfo)
 	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
 		return;
 
-	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	conn = connectToServer(&old_cluster, "template1");
 	res = executeQueryOrDie(conn, "SELECT count(*) "
-							"FROM pg_catalog.pg_subscription WHERE subdbid = %u",
-							dbinfo->db_oid);
-	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+							"FROM pg_catalog.pg_subscription");
+	cluster->nsubs = atoi(PQgetvalue(res, 0, 0));
 
 	PQclear(res);
 	PQfinish(conn);
 }
 
-/*
- * count_old_cluster_subscriptions()
- *
- * Returns the number of subscriptions for all databases.
- *
- * Note: this function always returns 0 if the old_cluster is PG16 and prior
- * because we gather subscriptions only for cluster versions greater than or
- * equal to PG17. See get_db_subscription_count().
- */
-int
-count_old_cluster_subscriptions(void)
-{
-	int			nsubs = 0;
-
-	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
-		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
-
-	return nsubs;
-}
-
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 8afe240bdf..e6dbbe6a93 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -197,7 +197,6 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
-	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -296,6 +295,7 @@ typedef struct
 	char		major_version_str[64];	/* string PG_VERSION of cluster */
 	uint32		bin_version;	/* version returned from pg_ctl */
 	const char *tablespace_suffix;	/* directory specification */
+	int			nsubs;			/* number of subscriptions */
 } ClusterInfo;
 
 
@@ -430,7 +430,6 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
-int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
-- 
2.39.3 (Apple Git-146)

#208Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#207)
Re: pg_upgrade and logical replication

On Sat, Jul 20, 2024 at 09:03:07PM -0500, Nathan Bossart wrote:

This is an extremely expensive way to perform that check, and so I'm
wondering why we don't just do

SELECT count(*) FROM pg_catalog.pg_subscription;

once in count_old_cluster_subscriptions().

Like so...

Ah, good catch. That sounds like a good thing to do because we don't
care about the number of subscriptions for each database in the
current code.

This is something that qualifies as an open item, IMO, as this code
is new to PG17.

A comment in get_db_rel_and_slot_infos() becomes incorrect where
get_old_cluster_logical_slot_infos() is called; it is still referring
to the subscription count.

Actually, on the same grounds, couldn't we do the logical slot info
retrieval in get_old_cluster_logical_slot_infos() in a single pass as
well? pg_replication_slots reports some information about all the
slots, and the current code has a qual on current_database(). It
looks to me that this could be replaced by a single query, ordering
the slots by database names, assigning the slot infos in each
database's DbInfo at the end. That would be much more efficient if
dealing with a lot of databases.
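A sketch of that single-pass assignment, in Python with illustrative column names (the real code would read the pg_replication_slots columns), assuming the rows come back from the one query already ordered by database name:

```python
from itertools import groupby
from operator import itemgetter

# Rows as a single cluster-wide query might return them, ordered by
# database name. Column names here are assumptions for illustration.
slot_rows = [
    {"datname": "db1", "slot_name": "slot_a"},
    {"datname": "db1", "slot_name": "slot_b"},
    {"datname": "db2", "slot_name": "slot_c"},
]

# One pass over the sorted rows: bucket the slot infos into the
# per-database structure (the analogue of each database's DbInfo).
slots_by_db = {
    datname: [r["slot_name"] for r in rows]
    for datname, rows in groupby(slot_rows, key=itemgetter("datname"))
}

assert slots_by_db == {"db1": ["slot_a", "slot_b"], "db2": ["slot_c"]}
```

Note that groupby only groups adjacent rows, which is why the ORDER BY on database name in the single query matters.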
--
Michael

#209Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#208)
Re: pg_upgrade and logical replication

On Mon, Jul 22, 2024 at 7:35 AM Michael Paquier <michael@paquier.xyz> wrote:

On Sat, Jul 20, 2024 at 09:03:07PM -0500, Nathan Bossart wrote:

This is an extremely expensive way to perform that check, and so I'm
wondering why we don't just do

SELECT count(*) FROM pg_catalog.pg_subscription;

once in count_old_cluster_subscriptions().

Like so...

Isn't it better to directly invoke get_subscription_count() in
check_new_cluster_subscription_configuration() where it is required
rather than in a db-specific general function?

Ah, good catch. That sounds like a good thing to do because we don't
care about the number of subscriptions for each database in the
current code.

This is something that qualifies as an open item, IMO, as this code
is new to PG17.

A comment in get_db_rel_and_slot_infos() becomes incorrect where
get_old_cluster_logical_slot_infos() is called; it is still referring
to the subscription count.

Actually, on the same grounds, couldn't we do the logical slot info
retrieval in get_old_cluster_logical_slot_infos() in a single pass as
well? pg_replication_slots reports some information about all the
slots, and the current code has a qual on current_database(). It
looks to me that this could be replaced by a single query, ordering
the slots by database names, assigning the slot infos in each
database's DbInfo at the end.

Unlike subscriptions, logical slots are database-specific objects. We
have some checks in the code, like the one on MyDatabaseId in
CreateDecodingContext(), which may or may not create a problem for this
case, as we don't consume changes when checking
LogicalReplicationSlotHasPendingWal via
binary_upgrade_logical_slot_has_caught_up(). But I think this needs
more analysis than what Nathan has proposed, so I suggest taking up
this task for PG18 if we want to optimize this code path.

--
With Regards,
Amit Kapila.

#210Nathan Bossart
nathandbossart@gmail.com
In reply to: Amit Kapila (#209)
1 attachment(s)
Re: pg_upgrade and logical replication

On Mon, Jul 22, 2024 at 03:45:19PM +0530, Amit Kapila wrote:

On Mon, Jul 22, 2024 at 7:35 AM Michael Paquier <michael@paquier.xyz> wrote:

On Sat, Jul 20, 2024 at 09:03:07PM -0500, Nathan Bossart wrote:

This is an extremely expensive way to perform that check, and so I'm
wondering why we don't just do

SELECT count(*) FROM pg_catalog.pg_subscription;

once in count_old_cluster_subscriptions().

Like so...

Isn't it better to directly invoke get_subscription_count() in
check_new_cluster_subscription_configuration() where it is required
rather than in a db-specific general function?

IIUC the old cluster won't be running at that point.

Ah, good catch. That sounds like a good thing to do because we don't
care about the number of subscriptions for each database in the
current code.

This is something that qualifies as an open item, IMO, as this code
is new to PG17.

+1

A comment in get_db_rel_and_slot_infos() becomes incorrect where
get_old_cluster_logical_slot_infos() is called; it is still referring
to the subscription count.

I removed this comment since IMHO it doesn't add much.

Actually, on the same grounds, couldn't we do the logical slot info
retrieval in get_old_cluster_logical_slot_infos() in a single pass as
well? pg_replication_slots reports some information about all the
slots, and the current code has a qual on current_database(). It
looks to me that this could be replaced by a single query, ordering
the slots by database names, assigning the slot infos in each
database's DbInfo at the end.

Unlike subscriptions, logical slots are database-specific objects. We
have some checks in the code like the one in CreateDecodingContext()
for MyDatabaseId which may or may not create a problem for this case
as we don't consume changes when checking
LogicalReplicationSlotHasPendingWal via
binary_upgrade_logical_slot_has_caught_up() but I think this needs
more analysis than what Nathan has proposed. So, I suggest taking up
this task for PG18 if we want to optimize this code path.

I see what you mean.

--
nathan

Attachments:

v2-0001-pg_upgrade-retrieve-subscription-count-more-effic.patch (text/plain)
From b3c3c6533e7f221dc869e613e78802a3054d42b3 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 20 Jul 2024 21:01:29 -0500
Subject: [PATCH v2 1/1] pg_upgrade: retrieve subscription count more
 efficiently

---
 src/bin/pg_upgrade/check.c      |  9 +++----
 src/bin/pg_upgrade/info.c       | 43 +++++++--------------------------
 src/bin/pg_upgrade/pg_upgrade.h |  3 +--
 3 files changed, 13 insertions(+), 42 deletions(-)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 27924159d6..39d14b7b92 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -1797,17 +1797,14 @@ check_new_cluster_subscription_configuration(void)
 {
 	PGresult   *res;
 	PGconn	   *conn;
-	int			nsubs_on_old;
 	int			max_replication_slots;
 
 	/* Subscriptions and their dependencies can be migrated since PG17. */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
 		return;
 
-	nsubs_on_old = count_old_cluster_subscriptions();
-
 	/* Quick return if there are no subscriptions to be migrated. */
-	if (nsubs_on_old == 0)
+	if (old_cluster.nsubs == 0)
 		return;
 
 	prep_status("Checking for new cluster configuration for subscriptions");
@@ -1821,10 +1818,10 @@ check_new_cluster_subscription_configuration(void)
 		pg_fatal("could not determine parameter settings on new cluster");
 
 	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
-	if (nsubs_on_old > max_replication_slots)
+	if (old_cluster.nsubs > max_replication_slots)
 		pg_fatal("\"max_replication_slots\" (%d) must be greater than or equal to the number of "
 				 "subscriptions (%d) on the old cluster",
-				 max_replication_slots, nsubs_on_old);
+				 max_replication_slots, old_cluster.nsubs);
 
 	PQclear(res);
 	PQfinish(conn);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 95c22a7200..e43be79aa5 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,7 +28,7 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
-static void get_db_subscription_count(DbInfo *dbinfo);
+static void get_subscription_count(ClusterInfo *cluster);
 
 
 /*
@@ -293,17 +293,13 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 
 		get_rel_infos(cluster, pDbInfo);
 
-		/*
-		 * Retrieve the logical replication slots infos and the subscriptions
-		 * count for the old cluster.
-		 */
 		if (cluster == &old_cluster)
-		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
-			get_db_subscription_count(pDbInfo);
-		}
 	}
 
+	if (cluster == &old_cluster)
+		get_subscription_count(cluster);
+
 	if (cluster == &old_cluster)
 		pg_log(PG_VERBOSE, "\nsource databases:");
 	else
@@ -750,14 +746,14 @@ count_old_cluster_logical_slots(void)
 /*
  * get_db_subscription_count()
  *
- * Gets the number of subscriptions in the database referred to by "dbinfo".
+ * Gets the number of subscriptions in the cluster.
  *
  * Note: This function will not do anything if the old cluster is pre-PG17.
  * This is because before that the logical slots are not upgraded, so we will
  * not be able to upgrade the logical replication clusters completely.
  */
 static void
-get_db_subscription_count(DbInfo *dbinfo)
+get_subscription_count(ClusterInfo *cluster)
 {
 	PGconn	   *conn;
 	PGresult   *res;
@@ -766,36 +762,15 @@ get_db_subscription_count(DbInfo *dbinfo)
 	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
 		return;
 
-	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	conn = connectToServer(&old_cluster, "template1");
 	res = executeQueryOrDie(conn, "SELECT count(*) "
-							"FROM pg_catalog.pg_subscription WHERE subdbid = %u",
-							dbinfo->db_oid);
-	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+							"FROM pg_catalog.pg_subscription");
+	cluster->nsubs = atoi(PQgetvalue(res, 0, 0));
 
 	PQclear(res);
 	PQfinish(conn);
 }
 
-/*
- * count_old_cluster_subscriptions()
- *
- * Returns the number of subscriptions for all databases.
- *
- * Note: this function always returns 0 if the old_cluster is PG16 and prior
- * because we gather subscriptions only for cluster versions greater than or
- * equal to PG17. See get_db_subscription_count().
- */
-int
-count_old_cluster_subscriptions(void)
-{
-	int			nsubs = 0;
-
-	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
-		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
-
-	return nsubs;
-}
-
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 8afe240bdf..e6dbbe6a93 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -197,7 +197,6 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
-	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -296,6 +295,7 @@ typedef struct
 	char		major_version_str[64];	/* string PG_VERSION of cluster */
 	uint32		bin_version;	/* version returned from pg_ctl */
 	const char *tablespace_suffix;	/* directory specification */
+	int			nsubs;			/* number of subscriptions */
 } ClusterInfo;
 
 
@@ -430,7 +430,6 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
-int			count_old_cluster_subscriptions(void);
 
 /* option.c */
 
-- 
2.39.3 (Apple Git-146)

#211Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#210)
Re: pg_upgrade and logical replication

On Mon, Jul 22, 2024 at 09:46:29AM -0500, Nathan Bossart wrote:

On Mon, Jul 22, 2024 at 03:45:19PM +0530, Amit Kapila wrote:

On Mon, Jul 22, 2024 at 7:35 AM Michael Paquier <michael@paquier.xyz> wrote:

A comment in get_db_rel_and_slot_infos() becomes incorrect where
get_old_cluster_logical_slot_infos() is called; it is still referring
to the subscription count.

I removed this comment since IMHO it doesn't add much.

WFM.

Actually, on the same grounds, couldn't we do the logical slot info
retrieval in get_old_cluster_logical_slot_infos() in a single pass as
well? pg_replication_slots reports some information about all the
slots, and the current code has a qual on current_database(). It
looks to me that this could be replaced by a single query, ordering
the slots by database name and assigning the slot infos to each
database's DbInfo at the end.

Unlike subscriptions, logical slots are database-specific objects. We
have some checks in the code like the one in CreateDecodingContext()
for MyDatabaseId which may or may not create a problem for this case
as we don't consume changes when checking
LogicalReplicationSlotHasPendingWal via
binary_upgrade_logical_slot_has_caught_up() but I think this needs
more analysis than what Nathan has proposed. So, I suggest taking up
this task for PG18 if we want to optimize this code path.

I see what you mean.

I am not sure I get the reason why get_old_cluster_logical_slot_infos()
could not be optimized, TBH. LogicalReplicationSlotHasPendingWal()
uses the fast-forward mode where no changes are generated, hence there
should be no need to depend on a connection to a specific
database :)

Combined with a hash table based on the database name and/or OID to know
which dbinfo to attach the information of a slot to, it should be
possible to use one query, making the slot info gathering closer to
O(N) rather than the current O(N^2).
--
Michael

#212Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#211)
Re: pg_upgrade and logical replication

On Tue, Jul 23, 2024 at 4:33 AM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Jul 22, 2024 at 09:46:29AM -0500, Nathan Bossart wrote:

On Mon, Jul 22, 2024 at 03:45:19PM +0530, Amit Kapila wrote:

Unlike subscriptions, logical slots are database-specific objects. We
have some checks in the code like the one in CreateDecodingContext()
for MyDatabaseId which may or may not create a problem for this case
as we don't consume changes when checking
LogicalReplicationSlotHasPendingWal via
binary_upgrade_logical_slot_has_caught_up() but I think this needs
more analysis than what Nathan has proposed. So, I suggest taking up
this task for PG18 if we want to optimize this code path.

I see what you mean.

I am not sure I get the reason why get_old_cluster_logical_slot_infos()
could not be optimized, TBH. LogicalReplicationSlotHasPendingWal()
uses the fast-forward mode where no changes are generated, hence there
should be no need to depend on a connection to a specific
database :)

Combined with a hash table based on the database name and/or OID to know
which dbinfo to attach the information of a slot to, it should be
possible to use one query, making the slot info gathering closer to
O(N) rather than the current O(N^2).

The point is that, unlike subscriptions, logical slots are not
cluster-level objects. So, this needs more careful design decisions
rather than a fix-up patch for PG17. One more thing: after collecting
the slot-level information, we also want to consider the creation of
slots, which again happens at the per-database level.

--
With Regards,
Amit Kapila.

#213Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Amit Kapila (#212)
RE: pg_upgrade and logical replication

Dear Amit, Michael,

I am not sure I get the reason why get_old_cluster_logical_slot_infos()
could not be optimized, TBH. LogicalReplicationSlotHasPendingWal()
uses the fast-forward mode where no changes are generated, hence there
should be no need to depend on a connection to a specific
database :)

Combined with a hash table based on the database name and/or OID to know
which dbinfo to attach the information of a slot to, it should be
possible to use one query, making the slot info gathering closer to
O(N) rather than the current O(N^2).

The point is that, unlike subscriptions, logical slots are not
cluster-level objects. So, this needs more careful design decisions
rather than a fix-up patch for PG17. One more thing: after collecting
the slot-level information, we also want to consider the creation of
slots, which again happens at the per-database level.

I also considered the combination with the optimization (parallelization) of
pg_upgrade [1]. IIUC, the patch tries to connect to several databases in parallel
and run commands. The current style of create_logical_replication_slots() can be
easily adapted because tasks are divided per database.

However, if we change get_old_cluster_logical_slot_infos() to work in a single
pass, we may have to shift LogicalSlotInfoArr to cluster-wide data and store the
database name in each LogicalSlotInfo. Also, in create_logical_replication_slots(),
we may have to check which database each slot belongs to and connect to the
appropriate database. These changes would make it difficult to parallelize the operation.

[1]: /messages/by-id/20240516211638.GA1688936@nathanxps13

Best regards,
Hayato Kuroda
FUJITSU LIMITED

#214Amit Kapila
amit.kapila16@gmail.com
In reply to: Nathan Bossart (#210)
Re: pg_upgrade and logical replication

On Mon, Jul 22, 2024 at 8:16 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

On Mon, Jul 22, 2024 at 03:45:19PM +0530, Amit Kapila wrote:

On Mon, Jul 22, 2024 at 7:35 AM Michael Paquier <michael@paquier.xyz> wrote:

On Sat, Jul 20, 2024 at 09:03:07PM -0500, Nathan Bossart wrote:

This is an extremely expensive way to perform that check, and so I'm
wondering why we don't just do

SELECT count(*) FROM pg_catalog.pg_subscription;

once in count_old_cluster_subscriptions().

Like so...

Isn't it better to directly invoke get_subscription_count() in
check_new_cluster_subscription_configuration() where it is required
rather than in a db-specific general function?

IIUC the old cluster won't be running at that point.

Right, the other option would be to move it to the place where we call
check_old_cluster_for_valid_slots(), etc. Initially, it was kept in
the specific function (get_db_rel_and_slot_infos) as we were
maintaining the count at the per-database level, but now that we are
changing that, I am not sure if calling it from the same place is a
good idea. But OTOH, it is okay to keep it at the place where we
retrieve the required information from the old cluster.

One minor point is the comment atop get_subscription_count() still
refers to the function name as get_db_subscription_count().

--
With Regards,
Amit Kapila.

#215Nathan Bossart
nathandbossart@gmail.com
In reply to: Amit Kapila (#214)
1 attachment(s)
Re: pg_upgrade and logical replication

On Tue, Jul 23, 2024 at 09:05:05AM +0530, Amit Kapila wrote:

Right, the other option would be to move it to the place where we call
check_old_cluster_for_valid_slots(), etc. Initially, it was kept in
the specific function (get_db_rel_and_slot_infos) as we were
maintaining the count at the per-database level, but now that we are
changing that, I am not sure if calling it from the same place is a
good idea. But OTOH, it is okay to keep it at the place where we
retrieve the required information from the old cluster.

I moved it to where you suggested.

One minor point is the comment atop get_subscription_count() still
refers to the function name as get_db_subscription_count().

Oops, fixed.

--
nathan

Attachments:

v3-0001-pg_upgrade-retrieve-subscription-count-more-effic.patch (text/plain; charset=us-ascii)
From 19831c5a2869f949e73564abea8a36858b39bcd1 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 20 Jul 2024 21:01:29 -0500
Subject: [PATCH v3 1/1] pg_upgrade: retrieve subscription count more
 efficiently

---
 src/bin/pg_upgrade/check.c      | 13 ++++-----
 src/bin/pg_upgrade/info.c       | 51 +++++----------------------------
 src/bin/pg_upgrade/pg_upgrade.h |  4 +--
 3 files changed, 15 insertions(+), 53 deletions(-)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 27924159d6..51e30a2f23 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -609,8 +609,10 @@ check_and_dump_old_cluster(bool live_check)
 
 		/*
 		 * Subscriptions and their dependencies can be migrated since PG17.
-		 * See comments atop get_db_subscription_count().
+		 * Before that the logical slots are not upgraded, so we will not be
+		 * able to upgrade the logical replication clusters completely.
 		 */
+		get_subscription_count(&old_cluster);
 		check_old_cluster_subscription_state();
 	}
 
@@ -1797,17 +1799,14 @@ check_new_cluster_subscription_configuration(void)
 {
 	PGresult   *res;
 	PGconn	   *conn;
-	int			nsubs_on_old;
 	int			max_replication_slots;
 
 	/* Subscriptions and their dependencies can be migrated since PG17. */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
 		return;
 
-	nsubs_on_old = count_old_cluster_subscriptions();
-
 	/* Quick return if there are no subscriptions to be migrated. */
-	if (nsubs_on_old == 0)
+	if (old_cluster.nsubs == 0)
 		return;
 
 	prep_status("Checking for new cluster configuration for subscriptions");
@@ -1821,10 +1820,10 @@ check_new_cluster_subscription_configuration(void)
 		pg_fatal("could not determine parameter settings on new cluster");
 
 	max_replication_slots = atoi(PQgetvalue(res, 0, 0));
-	if (nsubs_on_old > max_replication_slots)
+	if (old_cluster.nsubs > max_replication_slots)
 		pg_fatal("\"max_replication_slots\" (%d) must be greater than or equal to the number of "
 				 "subscriptions (%d) on the old cluster",
-				 max_replication_slots, nsubs_on_old);
+				 max_replication_slots, old_cluster.nsubs);
 
 	PQclear(res);
 	PQfinish(conn);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 95c22a7200..c07a69b63e 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,7 +28,6 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
-static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,15 +292,8 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 
 		get_rel_infos(cluster, pDbInfo);
 
-		/*
-		 * Retrieve the logical replication slots infos and the subscriptions
-		 * count for the old cluster.
-		 */
 		if (cluster == &old_cluster)
-		{
 			get_old_cluster_logical_slot_infos(pDbInfo, live_check);
-			get_db_subscription_count(pDbInfo);
-		}
 	}
 
 	if (cluster == &old_cluster)
@@ -748,54 +740,25 @@ count_old_cluster_logical_slots(void)
 }
 
 /*
- * get_db_subscription_count()
- *
- * Gets the number of subscriptions in the database referred to by "dbinfo".
+ * get_subscription_count()
  *
- * Note: This function will not do anything if the old cluster is pre-PG17.
- * This is because before that the logical slots are not upgraded, so we will
- * not be able to upgrade the logical replication clusters completely.
+ * Gets the number of subscriptions in the cluster.
  */
-static void
-get_db_subscription_count(DbInfo *dbinfo)
+void
+get_subscription_count(ClusterInfo *cluster)
 {
 	PGconn	   *conn;
 	PGresult   *res;
 
-	/* Subscriptions can be migrated since PG17. */
-	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
-		return;
-
-	conn = connectToServer(&old_cluster, dbinfo->db_name);
+	conn = connectToServer(cluster, "template1");
 	res = executeQueryOrDie(conn, "SELECT count(*) "
-							"FROM pg_catalog.pg_subscription WHERE subdbid = %u",
-							dbinfo->db_oid);
-	dbinfo->nsubs = atoi(PQgetvalue(res, 0, 0));
+							"FROM pg_catalog.pg_subscription");
+	cluster->nsubs = atoi(PQgetvalue(res, 0, 0));
 
 	PQclear(res);
 	PQfinish(conn);
 }
 
-/*
- * count_old_cluster_subscriptions()
- *
- * Returns the number of subscriptions for all databases.
- *
- * Note: this function always returns 0 if the old_cluster is PG16 and prior
- * because we gather subscriptions only for cluster versions greater than or
- * equal to PG17. See get_db_subscription_count().
- */
-int
-count_old_cluster_subscriptions(void)
-{
-	int			nsubs = 0;
-
-	for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
-		nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
-
-	return nsubs;
-}
-
 static void
 free_db_and_rel_infos(DbInfoArr *db_arr)
 {
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 8afe240bdf..e2b99b49fa 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -197,7 +197,6 @@ typedef struct
 											 * path */
 	RelInfoArr	rel_arr;		/* array of all user relinfos */
 	LogicalSlotInfoArr slot_arr;	/* array of all LogicalSlotInfo */
-	int			nsubs;			/* number of subscriptions */
 } DbInfo;
 
 /*
@@ -296,6 +295,7 @@ typedef struct
 	char		major_version_str[64];	/* string PG_VERSION of cluster */
 	uint32		bin_version;	/* version returned from pg_ctl */
 	const char *tablespace_suffix;	/* directory specification */
+	int			nsubs;			/* number of subscriptions */
 } ClusterInfo;
 
 
@@ -430,7 +430,7 @@ FileNameMap *gen_db_file_maps(DbInfo *old_db,
 							  const char *new_pgdata);
 void		get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
 int			count_old_cluster_logical_slots(void);
-int			count_old_cluster_subscriptions(void);
+void		get_subscription_count(ClusterInfo *cluster);
 
 /* option.c */
 
-- 
2.39.3 (Apple Git-146)

#216Amit Kapila
amit.kapila16@gmail.com
In reply to: Nathan Bossart (#215)
Re: pg_upgrade and logical replication

On Wed, Jul 24, 2024 at 1:25 AM Nathan Bossart <nathandbossart@gmail.com> wrote:

On Tue, Jul 23, 2024 at 09:05:05AM +0530, Amit Kapila wrote:

Right, the other option would be to move it to the place where we call
check_old_cluster_for_valid_slots(), etc. Initially, it was kept in
the specific function (get_db_rel_and_slot_infos) as we were
mainlining the count at the per-database level but now as we are
changing that I am not sure if calling it from the same place is a
good idea. But OTOH, it is okay to keep it at the place where we
retrieve the required information from the old cluster.

I moved it to where you suggested.

One minor point is the comment atop get_subscription_count() still
refers to the function name as get_db_subscription_count().

Oops, fixed.

LGTM.

--
With Regards,
Amit Kapila.

#217Nathan Bossart
nathandbossart@gmail.com
In reply to: Amit Kapila (#216)
Re: pg_upgrade and logical replication

On Wed, Jul 24, 2024 at 11:32:47AM +0530, Amit Kapila wrote:

LGTM.

Thanks for reviewing. Committed and back-patched to v17.

--
nathan

#218Amit Kapila
amit.kapila16@gmail.com
In reply to: Nathan Bossart (#217)
Re: pg_upgrade and logical replication

On Wed, Jul 24, 2024 at 10:03 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:

On Wed, Jul 24, 2024 at 11:32:47AM +0530, Amit Kapila wrote:

LGTM.

Thanks for reviewing. Committed and back-patched to v17.

Shall we close the open items? I think even if we want to improve the
slot fetching/creation mechanism, it should be part of PG18.

--
With Regards,
Amit Kapila.

#219Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#218)
Re: pg_upgrade and logical replication

On Thu, Jul 25, 2024 at 8:41 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jul 24, 2024 at 10:03 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:

On Wed, Jul 24, 2024 at 11:32:47AM +0530, Amit Kapila wrote:

LGTM.

Thanks for reviewing. Committed and back-patched to v17.

Shall we close the open items?

Sorry for the typo. There is only one open item corresponding to this:
"Subscription and slot information retrieval inefficiency in
pg_upgrade" which according to me should be closed after your commit.

--
With Regards,
Amit Kapila.

#220Nathan Bossart
nathandbossart@gmail.com
In reply to: Amit Kapila (#219)
Re: pg_upgrade and logical replication

On Thu, Jul 25, 2024 at 08:43:03AM +0530, Amit Kapila wrote:

Shall we close the open items?

Sorry for the typo. There is only one open item corresponding to this:
"Subscription and slot information retrieval inefficiency in
pg_upgrade" which according to me should be closed after your commit.

Oops, I forgot to do that. I've moved it to the "resolved before 17beta3"
section.

--
nathan

#221Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#220)
Re: pg_upgrade and logical replication

On Wed, Jul 24, 2024 at 10:16:51PM -0500, Nathan Bossart wrote:

On Thu, Jul 25, 2024 at 08:43:03AM +0530, Amit Kapila wrote:

Shall we close the open items?

Sorry for the typo. There is only one open item corresponding to this:
"Subscription and slot information retrieval inefficiency in
pg_upgrade" which according to me should be closed after your commit.

Oops, I forgot to do that. I've moved it to the "resolved before 17beta3"
section.

Removing the item sounds good to me. Thanks.
--
Michael