Fixing the docs for ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION
-hackers,
I think commit 82ed7748b710e3ddce3f7ebc74af80fe4869492f created some confusion that should be cleaned up before release. I'd like some guidance on what the intended behavior is before I submit a patch for this, though:
+ALTER SUBSCRIPTION mysubscription SET PUBLICATION nosuchpub WITH (copy_data = false, refresh = false);
+ALTER SUBSCRIPTION mysubscription ADD PUBLICATION nosuchpub WITH (copy_data = false, refresh = false);
+ALTER SUBSCRIPTION mysubscription DROP PUBLICATION nosuchpub WITH (copy_data = false, refresh = false);
+ERROR: unrecognized subscription parameter: "copy_data"
+ALTER SUBSCRIPTION mysubscription SET (copy_data = false, refresh = false);
+ERROR: unrecognized subscription parameter: "copy_data"
First, it's quite odd to say that "copy_data" is unrecognized in the third and fourth ALTER commands when it was recognized just fine in the first two.
More than that, though, the docs in doc/src/sgml/ref/alter_subscription.sgml refer to this part of the grammar in the first three ALTER commands as a "set_publication_option", not as a "subscription_parameter", a term which is only used in the grammar for other forms of the ALTER command. Per the grammar in the docs, "copy_data" is not a valid set_publication_option, only "refresh" is.
Should the first three ALTER commands fail with an error about "copy_data" being an invalid set_publication_option? Should they succeed, in which case the docs should mention that "refresh" is not the only valid set_publication_option?
Something else, perhaps?
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, May 22, 2021 at 1:49 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
-hackers,
I think commit 82ed7748b710e3ddce3f7ebc74af80fe4869492f created some confusion that should be cleaned up before release. I'd like some guidance on what the intended behavior is before I submit a patch for this, though:
+ALTER SUBSCRIPTION mysubscription SET PUBLICATION nosuchpub WITH (copy_data = false, refresh = false); +ALTER SUBSCRIPTION mysubscription ADD PUBLICATION nosuchpub WITH (copy_data = false, refresh = false); +ALTER SUBSCRIPTION mysubscription DROP PUBLICATION nosuchpub WITH (copy_data = false, refresh = false); +ERROR: unrecognized subscription parameter: "copy_data" +ALTER SUBSCRIPTION mysubscription SET (copy_data = false, refresh = false); +ERROR: unrecognized subscription parameter: "copy_data"First, it's quite odd to say that "copy_data" is unrecognized in the third and fourth ALTER commands when it was recognized just fine in the first two.
For ALTER SUBSCRIPTION ... DROP PUBLICATION, copy_data option is not
required actually, because it doesn't add new publications. If the
concern here is "why refresh is allowed but not copy_data", then the
answer is "with the refresh option the updated publications can be
refreshed, this avoids users to run REFRESH PUBLICATION after DROP
PUBLICATION". So, disallowing copy_data makes sense to me.
For ALTER SUBSCRIPTION ... SET, allowed options are slot_name,
synchronous_commit, binary and streaming which are part of
pg_subscription catalog and will be used by apply/sync workers.
Whereas copy_data and refresh are not part of pg_subscription catalog
and are not used by apply/sync workers (directly), but by the backend.
We have ALTER SUBSCRIPTION .. REFRESH specifically for refresh and
copy_data options.
More than that, though, the docs in doc/src/sgml/ref/alter_subscription.sgml refer to this part of the grammar in the first three ALTER commands as a "set_publication_option", not as a "subscription_parameter", a term which is only used in the grammar for other forms of the ALTER command. Per the grammar in the docs, "copy_data" is not a valid set_publication_option, only "refresh" is.
set_publication_option - options are refresh and copy_data (this
option comes implicitly, please see the note "Additionally, refresh
options as described under REFRESH PUBLICATION may be specified.",
under refresh_option we have copy_data)
subscription_parameter - options are slot_name, synchronous_commit,
binary, and streaming. This is correct.
Should the first three ALTER commands fail with an error about "copy_data" being an invalid set_publication_option? Should they succeed, in which case the docs should mention that "refresh" is not the only valid set_publication_option?
No that's not correct. As I said above, set_publication_option options
are both refresh and copy_data.
Something else, perhaps?
Unless I misunderstood any of your concerns, I think the existing docs
and the code looks correct to me.
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
On May 21, 2021, at 10:39 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
On Sat, May 22, 2021 at 1:49 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:-hackers,
I think commit 82ed7748b710e3ddce3f7ebc74af80fe4869492f created some confusion that should be cleaned up before release. I'd like some guidance on what the intended behavior is before I submit a patch for this, though:
+ALTER SUBSCRIPTION mysubscription SET PUBLICATION nosuchpub WITH (copy_data = false, refresh = false); +ALTER SUBSCRIPTION mysubscription ADD PUBLICATION nosuchpub WITH (copy_data = false, refresh = false); +ALTER SUBSCRIPTION mysubscription DROP PUBLICATION nosuchpub WITH (copy_data = false, refresh = false); +ERROR: unrecognized subscription parameter: "copy_data" +ALTER SUBSCRIPTION mysubscription SET (copy_data = false, refresh = false); +ERROR: unrecognized subscription parameter: "copy_data"First, it's quite odd to say that "copy_data" is unrecognized in the third and fourth ALTER commands when it was recognized just fine in the first two.
For ALTER SUBSCRIPTION ... DROP PUBLICATION, copy_data option is not
required actually, because it doesn't add new publications. If the
concern here is "why refresh is allowed but not copy_data", then the
answer is "with the refresh option the updated publications can be
refreshed, this avoids users to run REFRESH PUBLICATION after DROP
PUBLICATION". So, disallowing copy_data makes sense to me.
My concern isn't that the code is doing the wrong thing, but that the docs and the error messages are confusing. This is particularly troubling given that having a single action which combines the dropping of one publication with the refreshing of other publications is not particularly intuitive.
I agree that disallowing copy_data DROP PUBLICATION is a reasonable design choice, but I do not agree that this prohibition is intuitive. If I want to copy the data for a set of tables on a remote server, and only copy that data exactly once, I might be looking for an atomic action to do so. The docs are totally unclear on whether this is supported, so I might try:
CREATE SUBSCRIPTION tempsub CONNECTION 'dbname=remotedb' PUBLICATION remotepub WITH (connect = false, enabled = false, slot_name = NONE, create_slot = false);
ALTER SUBSCRIPTION tempsub DROP PUBLICATION remotepub WITH (refresh = true, copy_data = true);
with the intention that the data will be copied right before the publication is dropped. When I get an error that says 'unrecognized subscription parameter: "copy_data"', I'm likely to think I mistyped the parameter name, not that it is disallowed in this setting. If I then decide to just drop the publication (since my experiment didn't work) and try to do so using:
ALTER SUBSCRIPTION tempsub DROP PUBLICATION remotepub WITH (refresh = false, copy_data = false);
I seem to be playing by the rules, since I am explicitly not requesting "copy_data". That's what the "false" means. But again, the command complains that "copy_data" is unrecognized. At this point, I go back to the docs and it clearly says that "copy_data" is a supported parameter in this command. I'm totally confused.
I think the docs should say that "copy_data" is not allowed for DROP PUBLICATION. I think no error should occur for copy_data = false. For copy_data = true, I think the error message should say that copy_data is disallowed during a DROP PUBLICATION, rather than saying that the parameter is unrecognized.
For ALTER SUBSCRIPTION ... SET, allowed options are slot_name,
synchronous_commit, binary and streaming which are part of
pg_subscription catalog and will be used by apply/sync workers.
Whereas copy_data and refresh are not part of pg_subscription catalog
and are not used by apply/sync workers (directly), but by the backend.
We have ALTER SUBSCRIPTION .. REFRESH specifically for refresh and
copy_data options.More than that, though, the docs in doc/src/sgml/ref/alter_subscription.sgml refer to this part of the grammar in the first three ALTER commands as a "set_publication_option", not as a "subscription_parameter", a term which is only used in the grammar for other forms of the ALTER command. Per the grammar in the docs, "copy_data" is not a valid set_publication_option, only "refresh" is.
set_publication_option - options are refresh and copy_data (this
option comes implicitly, please see the note "Additionally, refresh
options as described under REFRESH PUBLICATION may be specified.",
under refresh_option we have copy_data)subscription_parameter - options are slot_name, synchronous_commit,
binary, and streaming. This is correct.Should the first three ALTER commands fail with an error about "copy_data" being an invalid set_publication_option? Should they succeed, in which case the docs should mention that "refresh" is not the only valid set_publication_option?
No that's not correct. As I said above, set_publication_option options
are both refresh and copy_data.
Well, not really. We're using the phrase "set_publication_option" for all three of SET PUBLICATION, ADD PUBLICATION, and DROP PUBLICATION. Since that's not really supported, we should use it only for the first two, and have a separate "drop_publication_option" for the third.
Something else, perhaps?
Unless I misunderstood any of your concerns, I think the existing docs
and the code looks correct to me.
Thanks for your response. The docs and error messages still don't look right to me.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, May 22, 2021 at 10:22 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
My concern isn't that the code is doing the wrong thing, but that the docs and the error messages are confusing. This is particularly troubling given that having a single action which combines the dropping of one publication with the refreshing of other publications is not particularly intuitive.
I agree that disallowing copy_data DROP PUBLICATION is a reasonable design choice, but I do not agree that this prohibition is intuitive. If I want to copy the data for a set of tables on a remote server, and only copy that data exactly once, I might be looking for an atomic action to do so. The docs are totally unclear on whether this is supported, so I might try:
CREATE SUBSCRIPTION tempsub CONNECTION 'dbname=remotedb' PUBLICATION remotepub WITH (connect = false, enabled = false, slot_name = NONE, create_slot = false);
ALTER SUBSCRIPTION tempsub DROP PUBLICATION remotepub WITH (refresh = true, copy_data = true);with the intention that the data will be copied right before the publication is dropped. When I get an error that says 'unrecognized subscription parameter: "copy_data"', I'm likely to think I mistyped the parameter name, not that it is disallowed in this setting. If I then decide to just drop the publication (since my experiment didn't work) and try to do so using:
ALTER SUBSCRIPTION tempsub DROP PUBLICATION remotepub WITH (refresh = false, copy_data = false);
I seem to be playing by the rules, since I am explicitly not requesting "copy_data". That's what the "false" means. But again, the command complains that "copy_data" is unrecognized. At this point, I go back to the docs and it clearly says that "copy_data" is a supported parameter in this command. I'm totally confused.
I think the docs should say that "copy_data" is not allowed for DROP PUBLICATION. I think no error should occur for copy_data = false. For copy_data = true, I think the error message should say that copy_data is disallowed during a DROP PUBLICATION, rather than saying that the parameter is unrecognized.
Thanks for the detailed explanation. I think there are two
possibilities - unrecognised options and disallowed options. If a user
enters an option 'blah_blah', then the error "unrecognized
subscription parameter: "blah_blah"" is meaningful. If a user enters
'copy_data' for DROP PUBLICATION, then an error something like
""copy_data" is not allowed for ALTER SUBSCRIPTION ... DROP
PUBLICATION" will be more meaningful. If this understanding is
correct, I wonder we should also have similar change for:
postgres=# ALTER SUBSCRIPTION testsub REFRESH PUBLICATION WITH (refresh=true);
ERROR: unrecognized subscription parameter: "refresh"
postgres=# ALTER SUBSCRIPTION testsub REFRESH PUBLICATION WITH
(synchronous_commit=' ');
ERROR: unrecognized subscription parameter: "synchronous_commit"
postgres=# ALTER SUBSCRIPTION testsub SET (refresh=true);
ERROR: unrecognized subscription parameter: "refresh"
Well, not really. We're using the phrase "set_publication_option" for all three of SET PUBLICATION, ADD PUBLICATION, and DROP PUBLICATION. Since that's not really supported, we should use it only for the first two, and have a separate "drop_publication_option" for the third.
There's another thread [1], that tries to fix this, where the earlier
suggestion was to drop_publication_option, but later the agreement was
to change the "set_publication_option" to "publication_option", and
had it for SET/ADD/DROP with a note like below. If that doesn't work,
I suggest putting the thoughts there in that thread.
- Additionally, refresh options as described
- under <literal>REFRESH PUBLICATION</literal> may be specified.
+ Additionally, refresh options as described under
<literal>REFRESH PUBLICATION</literal>
+ may be specified, except for <literal>DROP PUBLICATION</literal>.
Thanks for your response. The docs and error messages still don't look right to me.
I think, for the docs part we can move the discussion to the thread
[1]: /messages/by-id/CALDaNm34qugTr5M0d1JgCgk2Qdo6LZ9UEbTBG=TBNV5QADPLWg@mail.gmail.com
[1]: /messages/by-id/CALDaNm34qugTr5M0d1JgCgk2Qdo6LZ9UEbTBG=TBNV5QADPLWg@mail.gmail.com
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
On May 22, 2021, at 10:40 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
I think, for the docs part we can move the discussion to the thread
[1], if you are okay, and have the error message discussion here.[1] - /messages/by-id/CALDaNm34qugTr5M0d1JgCgk2Qdo6LZ9UEbTBG=TBNV5QADPLWg@mail.gmail.com
Sure, and thanks for the link!
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 21.05.21 22:19, Mark Dilger wrote:
+ALTER SUBSCRIPTION mysubscription DROP PUBLICATION nosuchpub WITH (copy_data = false, refresh = false); +ERROR: unrecognized subscription parameter: "copy_data" +ALTER SUBSCRIPTION mysubscription SET (copy_data = false, refresh = false); +ERROR: unrecognized subscription parameter: "copy_data"
Better wording might be something along the lines of "subscription
parameter %s not supported in this context". I'm not sure how easy this
would be to implement, but with enough brute force it would surely be
possible.