Bug in ALTER SUBSCRIPTION ... SERVER / ... CONNECTION with broken old server
Hi,
I tested the v19 new feature CREATE SUBSCRIPTION ... SERVER yesterday, and found an issue: once the old server becomes broken, the subscription cannot be recovered by switching it to a good server.
Here is a repro:
```
evantest=# CREATE EXTENSION postgres_fdw;
CREATE EXTENSION
# Create two servers
evantest=# CREATE SERVER old_srv FOREIGN DATA WRAPPER postgres_fdw
evantest-# OPTIONS (host '127.0.0.1', dbname 'postgres', port '5432');
CREATE SERVER
evantest=# CREATE SERVER new_srv FOREIGN DATA WRAPPER postgres_fdw
evantest-# OPTIONS (host '127.0.0.1', dbname 'postgres', port '5432');
CREATE SERVER
evantest=# CREATE USER MAPPING FOR CURRENT_USER SERVER old_srv
evantest-# OPTIONS (user 'dummy', password 'dummy');
CREATE USER MAPPING
evantest=# CREATE USER MAPPING FOR CURRENT_USER SERVER new_srv
evantest-# OPTIONS (user 'dummy', password 'dummy');
CREATE USER MAPPING
# Add old_server to a subscription
evantest=# CREATE SUBSCRIPTION sub_bug SERVER old_srv
evantest-# PUBLICATION pub_does_not_matter
evantest-# WITH (connect = false, slot_name = NONE);
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and alter the subscription to refresh publications.
CREATE SUBSCRIPTION
# Break old_srv
evantest=# DROP USER MAPPING FOR CURRENT_USER SERVER old_srv;
DROP USER MAPPING
# Fail to switch to a good server because old_srv is broken
evantest=# ALTER SUBSCRIPTION sub_bug SERVER new_srv;
ERROR: user mapping not found for user "chaol", server "old_srv"
evantest=# ALTER SUBSCRIPTION sub_bug CONNECTION 'host=127.0.0.1 dbname=postgres port=5432 user=dummy password=dummy';
ERROR: user mapping not found for user "chaol", server "old_srv"
```
In AlterSubscription(), this comment made me think this is a bug:
```
/*
* Skip ACL checks on the subscription's foreign server, if any. If
* changing the server (or replacing it with a raw connection), then the
* old one will be removed anyway. If changing something unrelated,
* there's no need to do an additional ACL check here; that will be done
* by the subscription worker anyway.
*/
sub = GetSubscription(subid, false, false);
```
The comment explicitly says to skip ACL checks on the old server because it will be removed anyway.
But after looking into GetSubscription(), I found that when the aclcheck parameter is false, it still calls ForeignServerConnectionString(). I think that is the root cause of the bug.
To fix this, I worked out a solution that stores the server OID in Subscription, and only calls ForeignServerConnectionString()lazily when sub->conninfo is actually needed. I also added a new test case to cover this scenario. Without the fix, the new test fails.
See attached patch for details.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Attachments:
v1-0001-Allow-altering-subscription-server-connection-wit.patchapplication/octet-stream; name=v1-0001-Allow-altering-subscription-server-connection-wit.patch; x-unix-mode=0644Download+106-14
Dear Chao,
I tested the v19 new feature CREATE SUBSCRIPTION ... SERVER yesterday, and
found an issue: once the old server becomes broken, the subscription cannot be
recovered by switching it to a good server.
Thanks for testing. I could reproduce the same issue. In addition to yours, I found
DROP SUBSCRIPTION cannot be done anymore. To switch the connection or drop it,
I had to create the same user mapping must be created again.
```
postgres=# DROP SUBSCRIPTION sub_bug ;
ERROR: user mapping not found for user "postgres", server "old_srv"
postgres=# CREATE USER MAPPING FOR CURRENT_USER SERVER old_srv
OPTIONS (user 'dummy', password 'dummy');
CREATE USER MAPPING
postgres=# DROP SUBSCRIPTION sub_bug ;
DROP SUBSCRIPTION
```
Before deep dive to your fix, I'm unclear why dropping the active USER MAPPING is
allowed. Personally, it should be avoided anyway. Do you know why it's not restricted?
Best regards,
Hayato Kuroda
FUJITSU LIMITED
On Apr 22, 2026, at 20:35, Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote:
Dear Chao,
I tested the v19 new feature CREATE SUBSCRIPTION ... SERVER yesterday, and
found an issue: once the old server becomes broken, the subscription cannot be
recovered by switching it to a good server.Thanks for testing. I could reproduce the same issue. In addition to yours, I found
DROP SUBSCRIPTION cannot be done anymore. To switch the connection or drop it,
I had to create the same user mapping must be created again.```
postgres=# DROP SUBSCRIPTION sub_bug ;
ERROR: user mapping not found for user "postgres", server "old_srv"
postgres=# CREATE USER MAPPING FOR CURRENT_USER SERVER old_srv
OPTIONS (user 'dummy', password 'dummy');
CREATE USER MAPPING
postgres=# DROP SUBSCRIPTION sub_bug ;
DROP SUBSCRIPTION
```Before deep dive to your fix, I'm unclear why dropping the active USER MAPPING is
allowed. Personally, it should be avoided anyway. Do you know why it's not restricted?Best regards,
Hayato Kuroda
FUJITSU LIMITED
Hi Hayato-san,
There is an existing test case in subscription.sql:
```
-- fail, must connect but lacks USAGE on server, as well as user mapping
DROP SUBSCRIPTION regress_testsub6;
```
So, I guess that’s an intentional behavior. You have to fix the broken server or switch to a good one before dropping the subscription. That’s my understanding from the test cases.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
On Wed, Apr 22, 2026 at 11:52 AM Chao Li <li.evan.chao@gmail.com> wrote:
Hi,
The comment explicitly says to skip ACL checks on the old server because it will be removed anyway.
But after looking into GetSubscription(), I found that when the aclcheck parameter is false, it still calls ForeignServerConnectionString(). I think that is the root cause of the bug.
To fix this, I worked out a solution that stores the server OID in Subscription, and only calls ForeignServerConnectionString()lazily when sub->conninfo is actually needed. I also added a new test case to cover this scenario. Without the fix, the new test fails.
See attached patch for details.
Hi Li,
Thanks for the patch.
Some comments:
1.
if (aclresult != ACLCHECK_OK)
ereport(ERROR,
- (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- errmsg("subscription owner \"%s\" does not
have permission on foreign server \"%s\"",
- GetUserNameFromId(subform->subowner, false),
- server->servername)));
+ errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("subscription owner \"%s\" does not
have permission on foreign server \"%s\"",
+ GetUserNameFromId(subform->subowner, false),
+ server->servername));
+ sub->conninfo = ForeignServerConnectionString(subform->subowner,
+ server);
}
Add a new line before the call to ForeignServerConnectionString(),
also I think you should put the if condition within curly brackets
because it spans more than one line and might confuse developers while
adding new code.
2. I think you should add a comment in the function header above
GetSubscription() stating that if aclcheck is false, then the conninfo
will be set to null and users need to call GetSubscriptionConnInfo to
get the conninfo.
3.
Datum
test_fdw_connection(PG_FUNCTION_ARGS)
{
+ GetUserMapping(PG_GETARG_OID(0), PG_GETARG_OID(1));
PG_RETURN_TEXT_P(cstring_to_text("dbname=regress_doesnotexist
user=doesnotexist password=secret"));
}
Add a comment above this change describing why it's required.
regards,
Ajin Cherian
Fujitsu Australia
On Apr 29, 2026, at 12:44, Ajin Cherian <itsajin@gmail.com> wrote:
On Wed, Apr 22, 2026 at 11:52 AM Chao Li <li.evan.chao@gmail.com> wrote:
Hi,
The comment explicitly says to skip ACL checks on the old server because it will be removed anyway.
But after looking into GetSubscription(), I found that when the aclcheck parameter is false, it still calls ForeignServerConnectionString(). I think that is the root cause of the bug.
To fix this, I worked out a solution that stores the server OID in Subscription, and only calls ForeignServerConnectionString()lazily when sub->conninfo is actually needed. I also added a new test case to cover this scenario. Without the fix, the new test fails.
See attached patch for details.Hi Li,
Thanks for the patch.
Thanks for your review.
Some comments:
1. if (aclresult != ACLCHECK_OK) ereport(ERROR, - (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), - errmsg("subscription owner \"%s\" does not have permission on foreign server \"%s\"", - GetUserNameFromId(subform->subowner, false), - server->servername))); + errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("subscription owner \"%s\" does not have permission on foreign server \"%s\"", + GetUserNameFromId(subform->subowner, false), + server->servername)); + sub->conninfo = ForeignServerConnectionString(subform->subowner, + server); }Add a new line before the call to ForeignServerConnectionString(),
Okay.
also I think you should put the if condition within curly brackets
because it spans more than one line and might confuse developers while
adding new code.
I don’t think curly brackets are needed as there is only one statement under the if though the statement spans multiple lines. There are plenty of examples in the existing code, for example:
```
if (!wrconn)
ereport(ERROR,
errcode(ERRCODE_CONNECTION_FAILURE),
errmsg("subscription \"%s\" could not connect to the publisher: %s",
sub->name, err));
```
2. I think you should add a comment in the function header above
GetSubscription() stating that if aclcheck is false, then the conninfo
will be set to null and users need to call GetSubscriptionConnInfo to
get the conninfo.
Updated the header comment.
3.
Datum
test_fdw_connection(PG_FUNCTION_ARGS)
{
+ GetUserMapping(PG_GETARG_OID(0), PG_GETARG_OID(1));
PG_RETURN_TEXT_P(cstring_to_text("dbname=regress_doesnotexist
user=doesnotexist password=secret"));
}Add a comment above this change describing why it's required.
Added the comment.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Attachments:
v2-0001-Allow-altering-subscription-server-connection-wit.patchapplication/octet-stream; name=v2-0001-Allow-altering-subscription-server-connection-wit.patch; x-unix-mode=0644Download+116-14
On Thu, Apr 30, 2026 at 2:12 PM Chao Li <li.evan.chao@gmail.com> wrote:
2. I think you should add a comment in the function header above
GetSubscription() stating that if aclcheck is false, then the conninfo
will be set to null and users need to call GetSubscriptionConnInfo to
get the conninfo.Updated the header comment.
/*
* Fetch the subscription from the syscache.
+ *
+ * If missing_ok is false, throw an error if the subscription is not found.
+ * If true, return NULL in that case.
+ *
+ * If aclcheck is true, check whether the subscription owner has permission on
+ * the subscription's foreign server, and load the connection string from the
+ * foreign server. Later, GetSubscriptionConnInfo() should be called to get
+ * the connection string.
*/
Subscription *
GetSubscription(Oid subid, bool missing_ok, bool aclcheck)
I don't think this comment is right. If aclcheck is true, users need
not call GetSubscriptionConnInfo() to get the connection string, as it
is populated in this function itself. It is only if aclecheck is
false, do callers need to do that.
Isn't that the case? There are multiple places in the apply worker
where GetSubscription() is called with aclcheck is true and
GetSubscriptionConnInfo() is not called there.
regards,
Ajin Cherian
Fujitsu Australia
Hello
- server = GetForeignServer(subform->subserver);
+ server = GetForeignServer(sub->serverid);
Couldn't we also move this inside the if?
+/*
+ * Return the subscription's connection string, loading it into the
+ * subscription memory context if necessary.
+ *
+ * GetSubscription must be called earlier to set sub->serverid, because ACL
+ * checks are performed there.
+ */
+char *
+GetSubscriptionConnInfo(Subscription *sub)
This is related to Ajin's comment earlier, the part about ACL check
seems incorrect to me.
On May 1, 2026, at 11:58, Ajin Cherian <itsajin@gmail.com> wrote:
On Thu, Apr 30, 2026 at 2:12 PM Chao Li <li.evan.chao@gmail.com> wrote:
2. I think you should add a comment in the function header above
GetSubscription() stating that if aclcheck is false, then the conninfo
will be set to null and users need to call GetSubscriptionConnInfo to
get the conninfo.Updated the header comment.
/* * Fetch the subscription from the syscache. + * + * If missing_ok is false, throw an error if the subscription is not found. + * If true, return NULL in that case. + * + * If aclcheck is true, check whether the subscription owner has permission on + * the subscription's foreign server, and load the connection string from the + * foreign server. Later, GetSubscriptionConnInfo() should be called to get + * the connection string. */ Subscription * GetSubscription(Oid subid, bool missing_ok, bool aclcheck)I don't think this comment is right. If aclcheck is true, users need
not call GetSubscriptionConnInfo() to get the connection string, as it
is populated in this function itself. It is only if aclecheck is
false, do callers need to do that.
Isn't that the case? There are multiple places in the apply worker
where GetSubscription() is called with aclcheck is true and
GetSubscriptionConnInfo() is not called there.regards,
Ajin Cherian
Fujitsu Australia
Hi Ajin,
Thank you for the comment. After rereading that part, I agree the wording is not clear.
What I meant is that GetSubscriptionConnInfo() is a safe accessor, if sub->conninfo has already been resolved, it just returns it; otherwise, it resolves it on demand. So the intended usage is that callers can consistently use GetSubscriptionConnInfo() when they need the connection string.
I also missed doing a broader search for direct uses of sub->conninfo and replacing them with GetSubscriptionConnInfo() where appropriate. Sorry about that.
I will address both issues in v3.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
On May 6, 2026, at 04:53, Zsolt Parragi <zsolt.parragi@percona.com> wrote:
Hello
- server = GetForeignServer(subform->subserver); + server = GetForeignServer(sub->serverid);Couldn't we also move this inside the if?
Ah, true. Both aclresult and server can be moved into the if.
+/* + * Return the subscription's connection string, loading it into the + * subscription memory context if necessary. + * + * GetSubscription must be called earlier to set sub->serverid, because ACL + * checks are performed there. + */ +char * +GetSubscriptionConnInfo(Subscription *sub)This is related to Ajin's comment earlier, the part about ACL check
seems incorrect to me.
Yes, see my reply to Ajin in the previous email.
PFA v3 - addressed Ajin and Zsolt’s comments.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Attachments:
v3-0001-Allow-altering-subscription-server-connection-wit.patchapplication/octet-stream; name=v3-0001-Allow-altering-subscription-server-connection-wit.patch; x-unix-mode=0644Download+130-25
On Wed, 2026-05-06 at 15:57 +0800, Chao Li wrote:
PFA v3 - addressed Ajin and Zsolt’s comments.
Thank you for the report!
The proposed patch seems unnecessarily complex, though. It seems too
easy to add GetSubscriptionConninfo() in the wrong place and end up
with another problem that's not easily detected.
Can't we just do something like the attached? It's easy to explain at
the call site that, when changing to a different server or using
CONNECTION instead, that we don't need the old conninfo at all.
I included your test case in my patch, and it passes.
Also, Hayato Kuroda's report was an issue also because the error could
be thrown even if slotname was NULL. Patch attached for that, as well.
Thank you, also!
Regards,
Jeff Davis
Attachments:
v4-0001-Avoid-errors-during-ALTER-SUBSCRIPTION.patchtext/x-patch; charset=UTF-8; name=v4-0001-Avoid-errors-during-ALTER-SUBSCRIPTION.patchDownload+92-40
v4-0002-Avoid-errors-during-DROP-SUBSCRIPTION.patchtext/x-patch; charset=UTF-8; name=v4-0002-Avoid-errors-during-DROP-SUBSCRIPTION.patchDownload+37-35
On May 9, 2026, at 09:01, Jeff Davis <pgsql@j-davis.com> wrote:
On Wed, 2026-05-06 at 15:57 +0800, Chao Li wrote:
PFA v3 - addressed Ajin and Zsolt’s comments.
Thank you for the report!
The proposed patch seems unnecessarily complex, though. It seems too
easy to add GetSubscriptionConninfo() in the wrong place and end up
with another problem that's not easily detected.Can't we just do something like the attached? It's easy to explain at
the call site that, when changing to a different server or using
CONNECTION instead, that we don't need the old conninfo at all.I included your test case in my patch, and it passes.
Also, Hayato Kuroda's report was an issue also because the error could
be thrown even if slotname was NULL. Patch attached for that, as well.
Thank you, also!Regards,
Jeff Davis<v4-0001-Avoid-errors-during-ALTER-SUBSCRIPTION.patch><v4-0002-Avoid-errors-during-DROP-SUBSCRIPTION.patch>
Ah, I see. You added a new conninfo_needed parameter to GetSubscription(), which separates the decision of building conninfo from the ACL check. Cool, I believe this is a better approach.
So 0001 looks good to me. nitpick is that conninfo_aclcheck is now only meaningful when conninfo_needed is true. I wonder if we should mention that briefly in the function header comment, or add an assertion such as: Assert(conninfo_needed || !conninfo_aclcheck); to avoid possible misuse of conninfo_aclcheck in the future.
For 0002, I have a doubt. Now conninfo is built only when slotname is not NULL. But after reading through DropSubscription(), I am not sure conninfo is strictly tied to slotname.
For example, this fast path returns only when both slotname is NULL and rstates is NIL:
```
/*
* If there is no slot associated with the subscription, we can finish
* here.
*/
if (!slotname && rstates == NIL)
{
table_close(rel, NoLock);
return;
}
```
That seems to imply that even when slotname is NULL, rstates might still be not NIL.
Later, if conninfo is not NULL, the code connects to the publisher and does some cleanup work for tablesync slots:
```
if (conninfo)
wrconn = walrcv_connect(conninfo, true, true, must_use_password,
subname, &err);
...
/*
* Drop the tablesync slots associated with removed tables.
*
* For SYNCDONE/READY states, the tablesync slot is known to have
* already been dropped by the tablesync worker.
*
* For other states, there is no certainty, maybe the slot does
* not exist yet. Also, if we fail after removing some of the
* slots, next time, it will again try to drop already dropped
* slots and fail. For these reasons, we allow missing_ok = true
* for the drop.
*/
if (rstate->state != SUBREL_STATE_SYNCDONE)
{
char syncslotname[NAMEDATALEN] = {0};
ReplicationSlotNameForTablesync(subid, relid, syncslotname,
sizeof(syncslotname));
ReplicationSlotDropAtPubNode(wrconn, syncslotname, true);
}
```
So with 0002, if slotname is NULL but rstates is not NIL, it looks possible that we no longer build conninfo and therefore skip the cleanup on the publisher side.
Best reagards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Hi,
I added a fix to the series: v5-0001 fixes check_pub_rdt for the
foreign server case.
On Sat, 2026-05-09 at 11:08 +0800, Chao Li wrote:
Ah, I see. You added a new conninfo_needed parameter to
GetSubscription(), which separates the decision of building conninfo
from the ACL check. Cool, I believe this is a better approach.So 0001 looks good to me. nitpick is that conninfo_aclcheck is now
only meaningful when conninfo_needed is true. I wonder if we should
mention that briefly in the function header comment, or add an
assertion such as: Assert(conninfo_needed || !conninfo_aclcheck); to
avoid possible misuse of conninfo_aclcheck in the future.
I refactored the fix in v5-0002 to do this in a more organized way: now
all option parsing happens first, so I can more precisely decide which
paths need conninfo and which ones don't.
So with 0002, if slotname is NULL but rstates is not NIL, it looks
possible that we no longer build conninfo and therefore skip the
cleanup on the publisher side.
I separated this into two patches:
v5-0003 just moves the connection string building after the early exit,
so that if slotname is NONE and rstates is NIL, then it won't try to
build the connection string at all (and therefore won't get an error
while doing so).
v5-0004 fixes the remaining issue when slotname is NONE and rstates is
*not* NIL. It uses a subtransaction to catch the error, so that the
DROP TRANSACTION will still succeed even though it can't connect to the
publisher to drop the tablesync slots. This feels a bit over-
engineered, but it does maintain the expected behavior in this case. It
also routes errors inside of ForeignServerConnectionString() through
ReportSlotConnectionError(), which adds a helpful hint.
Regards,
Jeff Davis
Attachments:
v5-0001-Check-retain_dead_tuples-for-ALTER-SUBSCRIPTION-..patchtext/x-patch; charset=UTF-8; name=v5-0001-Check-retain_dead_tuples-for-ALTER-SUBSCRIPTION-..patchDownload+15-7
v5-0002-Avoid-errors-during-ALTER-SUBSCRIPTION.patchtext/x-patch; charset=UTF-8; name=v5-0002-Avoid-errors-during-ALTER-SUBSCRIPTION.patchDownload+181-72
v5-0003-Avoid-errors-during-DROP-SUBSCRIPTION-when-slot_n.patchtext/x-patch; charset=UTF-8; name=v5-0003-Avoid-errors-during-DROP-SUBSCRIPTION-when-slot_n.patchDownload+57-40
v5-0004-DROP-SUBSCRIPTION-improve-error-message.patchtext/x-patch; charset=UTF-8; name=v5-0004-DROP-SUBSCRIPTION-improve-error-message.patchDownload+47-22
On May 15, 2026, at 05:45, Jeff Davis <pgsql@j-davis.com> wrote:
Hi,
I added a fix to the series: v5-0001 fixes check_pub_rdt for the
foreign server case.On Sat, 2026-05-09 at 11:08 +0800, Chao Li wrote:
Ah, I see. You added a new conninfo_needed parameter to
GetSubscription(), which separates the decision of building conninfo
from the ACL check. Cool, I believe this is a better approach.So 0001 looks good to me. nitpick is that conninfo_aclcheck is now
only meaningful when conninfo_needed is true. I wonder if we should
mention that briefly in the function header comment, or add an
assertion such as: Assert(conninfo_needed || !conninfo_aclcheck); to
avoid possible misuse of conninfo_aclcheck in the future.I refactored the fix in v5-0002 to do this in a more organized way: now
all option parsing happens first, so I can more precisely decide which
paths need conninfo and which ones don't.So with 0002, if slotname is NULL but rstates is not NIL, it looks
possible that we no longer build conninfo and therefore skip the
cleanup on the publisher side.I separated this into two patches:
v5-0003 just moves the connection string building after the early exit,
so that if slotname is NONE and rstates is NIL, then it won't try to
build the connection string at all (and therefore won't get an error
while doing so).v5-0004 fixes the remaining issue when slotname is NONE and rstates is
*not* NIL. It uses a subtransaction to catch the error, so that the
DROP TRANSACTION will still succeed even though it can't connect to the
publisher to drop the tablesync slots. This feels a bit over-
engineered, but it does maintain the expected behavior in this case. It
also routes errors inside of ForeignServerConnectionString() through
ReportSlotConnectionError(), which adds a helpful hint.Regards,
Jeff Davis<v5-0001-Check-retain_dead_tuples-for-ALTER-SUBSCRIPTION-..patch><v5-0002-Avoid-errors-during-ALTER-SUBSCRIPTION.patch><v5-0003-Avoid-errors-during-DROP-SUBSCRIPTION-when-slot_n.patch><v5-0004-DROP-SUBSCRIPTION-improve-error-message.patch>
I have just one comment on v5:
In 0002, for both ALTER_SUBSCRIPTION_SERVER and ALTER_SUBSCRIPTION_CONNECTION, conninfo_needed is false:
```
if (stmt->kind == ALTER_SUBSCRIPTION_SERVER ||
stmt->kind == ALTER_SUBSCRIPTION_CONNECTION)
{
conninfo_needed = false;
}
```
0001 adds "check_pub_rdt = sub->retaindeadtuples;" to the both paths:
0002 adds this Assert:
```
if (update_failover || update_two_phase || check_pub_rdt)
{
bool must_use_password;
char *err;
WalReceiverConn *wrconn;
Assert(conninfo_needed);
```
So, for those two paths, if check_pub_rdt is true, then the Assert will be fired, is that intentional?
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
On Fri, 2026-05-15 at 15:18 +0800, Chao Li wrote:
0002 adds this Assert:
```
if (update_failover || update_two_phase || check_pub_rdt)
{
bool must_use_password;
char *err;
WalReceiverConn *wrconn;Assert(conninfo_needed);
```So, for those two paths, if check_pub_rdt is true, then the Assert
will be fired, is that intentional?
I've fixed it to be Assert(new_conninfo || orig_conninfo_needed).
Also, the code above was missing the case of SUBOPT_ORIGIN which could
set check_pub_rdt. I changed it to be more conservative and set
orig_conninfo_needed=false when one of:
ALTER SUBSCRIPTION ... SERVER
ALTER SUBSCRIPTION ... CONNECTION
ALTER SUBSCRIPTION ... SET (slot_name=NONE)
and not try to be precise about which other settings might need
check_pub_rdt or not.
What do you think of v6-0003? Is it over-engineered? Should the
subtransaction happen at a lower level? Is there an alternative to
using a subtransaction?
Regards,
Jeff Davis
Attachments:
v6-0001-Avoid-errors-during-ALTER-SUBSCRIPTION.patchtext/x-patch; charset=UTF-8; name=v6-0001-Avoid-errors-during-ALTER-SUBSCRIPTION.patchDownload+176-72
v6-0002-Avoid-errors-during-DROP-SUBSCRIPTION-when-slot_n.patchtext/x-patch; charset=UTF-8; name=v6-0002-Avoid-errors-during-DROP-SUBSCRIPTION-when-slot_n.patchDownload+57-40
v6-0003-DROP-SUBSCRIPTION-improve-error-message.patchtext/x-patch; charset=UTF-8; name=v6-0003-DROP-SUBSCRIPTION-improve-error-message.patchDownload+47-22
On May 16, 2026, at 07:18, Jeff Davis <pgsql@j-davis.com> wrote:
On Fri, 2026-05-15 at 15:18 +0800, Chao Li wrote:
0002 adds this Assert:
```
if (update_failover || update_two_phase || check_pub_rdt)
{
bool must_use_password;
char *err;
WalReceiverConn *wrconn;Assert(conninfo_needed);
```So, for those two paths, if check_pub_rdt is true, then the Assert
will be fired, is that intentional?I've fixed it to be Assert(new_conninfo || orig_conninfo_needed).
Also, the code above was missing the case of SUBOPT_ORIGIN which could
set check_pub_rdt. I changed it to be more conservative and set
orig_conninfo_needed=false when one of:ALTER SUBSCRIPTION ... SERVER
ALTER SUBSCRIPTION ... CONNECTION
ALTER SUBSCRIPTION ... SET (slot_name=NONE)and not try to be precise about which other settings might need
check_pub_rdt or not.
Yep, this part looks good now.
What do you think of v6-0003? Is it over-engineered? Should the
subtransaction happen at a lower level? Is there an alternative to
using a subtransaction?
For the reason you described in the commit message, catching the error and later reporting it through ReportSlotConnectionError(), I don't think this is over-engineered. I also think keeping the subtransaction inside construct_subserver_conninfo() is reasonable, because this error-capturing behavior seems specific to the DROP SUBSCRIPTION path.
As for whether the HINT itself really helps, I would reserve my opinion. As the test output shows:
```
DROP SUBSCRIPTION regress_testsub6;
ERROR: could not connect to publisher when attempting to drop replication slot "dummy": user mapping not found for user "regress_subscription_user3", server "test_server"
HINT: Use ALTER SUBSCRIPTION ... DISABLE to disable the subscription, and then use ALTER SUBSCRIPTION ... SET (slot_name = NONE) to disassociate it from the slot.
```
The error message already says that the problem is “user mapping not found”, so fixing the user mapping could also be a solution. So, the HINT is still useful, but it might not be the most direct fix in some case.
I got another small comment. Now construct_subserver_conninfo() has some duplicate code block of getting foreign server, aclcheck and ForeignServerConnectionString as what GetSubscription() has, maybe we can wrap that piece of code into a helper function.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/