BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

Started by PG Bug reporting formover 1 year ago20 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 18644
Logged by: Maxim Boguk
Email address: maxim.boguk@gmail.com
PostgreSQL version: 15.8
Operating system: Ubuntu
Description:

Hi,

When ALTER PUBLICATION ... SET (publish_via_partition_root) executed on the
existing logical replication with data
(following ALTER SUBSCRIPTION ... REFRESH PUBLICATION), the database start
copy whole partitioned table from start (thus breaking existing logical
replication).
What's worse - I didn't found any way get out of such situation less than
redo all multi-TB logical replication from blank db.

Also ALTER SUBSCRIPTION ... REFRESH PUBLICATION (copy_data=false) - cannot
be used as workaround because it lead to loose any changes in partitioned
table between run ALTER PUBLICATION and ALTER SUBSCRIPTION.

Afterthought this behavior not surprising at all but I think better to
document it somewhere (or even better disable ALTER PUBLICATION ... SET
(publish_via_partition_root) for any publication with existing subscriptions
because it will break things for sure).

After I look into pub/sub code - I feel it will be very complicated task to
make it work correctly.

--
Maxim

#2Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: PG Bug reporting form (#1)
RE: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

Dear Maxim,

Thanks for reporting the issue. Before discussing more about it, I want to confirm
the exact workload you did. Is below sequence same as you expected?

0. There is a table which have leaf tables on publisher/subscriber.
1. create a PUBLICATION and include a root table
2. create a SUBSCRIPTION to subscribe above
3. insert some data on pub and wait until the replication is done.
4. do ALTER PUBLICAITON WITH (publish_via_partition_root = true)
5. insert tuples on the publisher
-> tuples won't be replicated until ALTER SUBSCRIPTION REFRESH PUBLICATION is done.

If the copy_data is true for REFRESH, all tuples inserted at 3. will be copied again.
Otherwise, data between ALTER PUB. - ALTER SUB. will be lost.

Attached .pl file (renamed to .txt due to the security reason) can reproduce and
test the situation via our test framework.

Best regards,
Hayato Kuroda
FUJITSU LIMITED

Attachments:

034_alter_pub.txttext/plain; name=034_alter_pub.txtDownload
#3Amit Kapila
amit.kapila16@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

On Wed, Oct 2, 2024 at 12:14 AM PG Bug reporting form
<noreply@postgresql.org> wrote:

The following bug has been logged on the website:

...

When ALTER PUBLICATION ... SET (publish_via_partition_root) executed on the
existing logical replication with data
(following ALTER SUBSCRIPTION ... REFRESH PUBLICATION), the database start
copy whole partitioned table from start (thus breaking existing logical
replication).
What's worse - I didn't found any way get out of such situation less than
redo all multi-TB logical replication from blank db.

Also ALTER SUBSCRIPTION ... REFRESH PUBLICATION (copy_data=false) - cannot
be used as workaround because it lead to loose any changes in partitioned
table between run ALTER PUBLICATION and ALTER SUBSCRIPTION.

Afterthought this behavior not surprising at all but I think better to
document it somewhere.

Can you tell us what additional information you want to document other
than what is already documented in CREATE PUBLICATION [1]https://www.postgresql.org/docs/devel/sql-createpublication.html -- With Regards, Amit Kapila. related to
this parameter?

It would be useful if you can create a small test case to show the
exact problem and what is your usecase for the same?

[1]: https://www.postgresql.org/docs/devel/sql-createpublication.html -- With Regards, Amit Kapila.
--
With Regards,
Amit Kapila.

#4Maxim Boguk
maxim.boguk@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#2)
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

On Fri, Oct 4, 2024 at 7:30 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Dear Maxim,

Thanks for reporting the issue. Before discussing more about it, I want to confirm
the exact workload you did. Is below sequence same as you expected?

Yes it is exactly the my case/sequence of actions (but with 10TB database ().
Thank you for creating a reproducer (I was working on it but due time
constraints I didn't finish it before you posted your test case).

--
Maxim Boguk
Senior Postgresql DBA

Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678

#5Maxim Boguk
maxim.boguk@gmail.com
In reply to: Amit Kapila (#3)
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

Can you tell us what additional information you want to document other
than what is already documented in CREATE PUBLICATION [1] related to
this parameter?

At least put a warning that alter publication set
(publish_via_partition_root) will unrecoverable break any existing
logical replication subscribers if there at least one partitioned
table with data in replication set,
so alter publication set (publish_via_partition_root) should be run
only before any subscription activation.
Ideally - make this use case actually working (but looking into the
code - it would be very complicated I feel).

It would be useful if you can create a small test case to show the exact problem and what is your usecase for the same?

Usecase - after initial load of logical replica I decided that on the
replica I better split future data into weekly partitions due huge
size (instead of monthly partitions on the master/publisher)
exactly the case for "alter publication set (publish_via_partition_root)".

My main issues with this case - there is no way to fix this problem if
it happened less than reloading whole logical replication from blank.

--
Maxim Boguk
Senior Postgresql DBA

Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678

#6Amit Kapila
amit.kapila16@gmail.com
In reply to: Maxim Boguk (#5)
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

On Fri, Oct 4, 2024 at 7:38 PM Maxim Boguk <maxim.boguk@gmail.com> wrote:

It would be useful if you can create a small test case to show the exact problem and what is your usecase for the same?

Usecase - after initial load of logical replica I decided that on the
replica I better split future data into weekly partitions due huge
size (instead of monthly partitions on the master/publisher)
exactly the case for "alter publication set (publish_via_partition_root)".

My main issues with this case - there is no way to fix this problem if
it happened less than reloading whole logical replication from blank.

You can prevent the problem by avoiding writes to the partitioned
tables between the Alter Pub and Alter Sub steps. One idea could be
that in a parallel session on publisher lock the parent table in
Access Exclusive mode till the Alter Sub command (with
copy_data=false) is finished.

--
With Regards,
Amit Kapila.

#7Maxim Boguk
maxim.boguk@gmail.com
In reply to: Amit Kapila (#6)
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

On Tue, Oct 8, 2024 at 12:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 4, 2024 at 7:38 PM Maxim Boguk <maxim.boguk@gmail.com> wrote:

It would be useful if you can create a small test case to show the exact problem and what is your usecase for the same?

Usecase - after initial load of logical replica I decided that on the
replica I better split future data into weekly partitions due huge
size (instead of monthly partitions on the master/publisher)
exactly the case for "alter publication set (publish_via_partition_root)".

My main issues with this case - there is no way to fix this problem if
it happened less than reloading whole logical replication from blank.

You can prevent the problem by avoiding writes to the partitioned
tables between the Alter Pub and Alter Sub steps. One idea could be
that in a parallel session on publisher lock the parent table in
Access Exclusive mode till the Alter Sub command (with
copy_data=false) is finished.

--
With Regards,
Amit Kapila.

Thank you, it should be work. Unfortunately my English writing now is
not good enough to suggest correct and easy to understand warnings in
the documentation about this issue.

--
Maxim Boguk
Senior Postgresql DBA

Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678

#8Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Maxim Boguk (#7)
RE: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

Dear Maxim, Amit,

Thank you, it should be work. Unfortunately my English writing now is
not good enough to suggest correct and easy to understand warnings in
the documentation about this issue.

I've written a doc patch to add a caution in ALTER PUBLICATION page. How do you feel?

Best regards,
Hayato Kuroda
FUJITSU LIMITED

Attachments:

add_caution.diffsapplication/octet-stream; name=add_caution.diffsDownload+16-0
#9vignesh C
vignesh21@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#8)
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

On Wed, 9 Oct 2024 at 11:01, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Dear Maxim, Amit,

Thank you, it should be work. Unfortunately my English writing now is
not good enough to suggest correct and easy to understand warnings in
the documentation about this issue.

I've written a doc patch to add a caution in ALTER PUBLICATION page. How do you feel?

Few comments:
1) How about changing:
+      <para>
+       Altering the <literal>publish_via_partition_root</literal> during the
+       replication can lead to data loss or duplication. This is because the
+       publisher replicates changes as the partitioned table after the
+       altering, but it is not listed in
+       <link linkend="catalog-pg-subscription-rel"><literal>pg_subscription_rel</literal></link>.
+      </para>

to:
When the partition root table is specified as the replication target
instead of its leaf tables, changing the publish_via_partition_root
parameter during replication can cause data loss or duplication. This
happens because the subscriber initially subscribed to the leaf
tables. After running the ALTER PUBLICATION ... SET and ALTER
SUBSCRIPTION ... REFRESH PUBLICATION command, the subscriber will
subscribe to the partition root table.

2) Should we change:
+       To prevent the issue, you can avoid modifying the partitioned table
+       between the <command>ALTER PUBLICATION ... SET</command> and
+       <link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
to:
+       To prevent the issue, you can avoid modifying the partitioned table
+       between the <command>ALTER PUBLICATION ... SET</command> and
+       <link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... REFRESH PUBLICATION</command></link> and refresh
publication with copy_data as off.

Regards,
Vignesh

#10Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: vignesh C (#9)
RE: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

Dear Vignesh,

Thanks for giving comments! Since I'm not a native I followed your points.
I ran Grammaly just in case and it said OK.

PSA new version.

Best regards,
Hayato Kuroda
FUJITSU LIMITED

Attachments:

v2_add_caution.diffsapplication/octet-stream; name=v2_add_caution.diffsDownload+18-0
#11Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#10)
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

On Wed, Oct 9, 2024 at 6:04 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Thanks for giving comments! Since I'm not a native I followed your points.
I ran Grammaly just in case and it said OK.

PSA new version.

+     <caution>
+      <para>
+       When the partition root table is specified as the replication target
+       instead of its leaf tables, altering the
+       <literal>publish_via_partition_root</literal> parameter during
+       replication can cause data loss or duplication. This happens because
+       the subscriber initially subscribed to the leaf tables.

This assumes that the user is always changing
publish_via_partition_root from 'false' to 'true'. Can't she change
from 'false' to 'true' as well?

+      <para>
+       To prevent the issue, you can avoid modifying the partitioned table
+       between the <command>ALTER PUBLICATION ... SET</command> and

Can't the problem happen when any of the leaf tables are modified? If
so, that is not clear from the above statement.

--
With Regards,
Amit Kapila.

#12Maxim Boguk
maxim.boguk@gmail.com
In reply to: Amit Kapila (#11)
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
+      <para>
+       To prevent the issue, you can avoid modifying the partitioned table
+       between the <command>ALTER PUBLICATION ... SET</command> and

Can't the problem happen when any of the leaf tables are modified? If
so, that is not clear from the above statement.

Problem also happens if publication is created via FOR ALL TABLES /
FOR TABLES IN SCHEMA (without directly including partition head into
publication).

I have a suspicion that including partition head (directly or via FOR
ALL TABLES) into the initial publication - really required to get a
problem.
What is expected behavior in case when only partitions are included
into publication/subscription set, but partition head isn't included
and publish_via_partition_root='true' executed?

--
Maxim Boguk
Senior Postgresql DBA

Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678

#13Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Amit Kapila (#11)
RE: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

Dear Amit,

Thanks for giving comments. PSA new version patch.

+     <caution>
+      <para>
+       When the partition root table is specified as the replication target
+       instead of its leaf tables, altering the
+       <literal>publish_via_partition_root</literal> parameter during
+       replication can cause data loss or duplication. This happens because
+       the subscriber initially subscribed to the leaf tables.

This assumes that the user is always changing
publish_via_partition_root from 'false' to 'true'. Can't she change
from 'false' to 'true' as well?

Yeah, users can change for arbitrary directions, and the issue can happen in any
cases. Toggling the option changes the oid contained in the replication messages,
this causes a loss. I adjusted a description.

+      <para>
+       To prevent the issue, you can avoid modifying the partitioned table
+       between the <command>ALTER PUBLICATION ... SET</command>
and

Can't the problem happen when any of the leaf tables are modified? If
so, that is not clear from the above statement.

Partition root table is a virtual object, and actual changes are done at its leaf.
Based on that "modifying the partitioned table" may be odd. How about "modifying
any leaf tables"?

Best regards,
Hayato Kuroda
FUJITSU LIMITED

Attachments:

v3_add_caution.diffsapplication/octet-stream; name=v3_add_caution.diffsDownload+20-0
#14Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Maxim Boguk (#12)
RE: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

Dear Maxim,

Problem also happens if publication is created via FOR ALL TABLES /
FOR TABLES IN SCHEMA (without directly including partition head into
publication).

Thanks for pointing out. I also confirmed this issue can happen for
FOR ALL TABLES/FOR TABLES IN SCHEMA publications. I modified a documentation.

What is expected behavior in case when only partitions are included
into publication/subscription set, but partition head isn't included
and publish_via_partition_root='true' executed?

Just to confirm - you meant like below definitions, right?

```
create table tab (a int) partition by range (a);
create table tab_1 partition of tab for values from (-10) to (0);
create table tab_2 partition of tab for values from (0) to (10);
create publication pub for table tab_1, tab_2 with (publish_via_partition_root = true);
```

For now, changes are replicated as leaf table's one in above case. And I think
it is the expected behavior because users expressly specifies them.

Best regards,
Hayato Kuroda
FUJITSU LIMITED

#15Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#14)
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

On Wed, Oct 23, 2024 at 3:42 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Problem also happens if publication is created via FOR ALL TABLES /
FOR TABLES IN SCHEMA (without directly including partition head into
publication).

Thanks for pointing out. I also confirmed this issue can happen for
FOR ALL TABLES/FOR TABLES IN SCHEMA publications. I modified a documentation.

I don't think we need to mention this in docs, the following change:
"For now, changes are replicated as leaf table's one in above case.
And I think it is the expected behavior because users expressly
specifies them." added by you appears odd to me.

--
With Regards,
Amit Kapila.

#16Peter Smith
smithpb2250@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#13)
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

Hi Kuroda-san,

This is only a caution for ALTER of 'publish_via_partition_root', so
that should be immediately clear up-front in the first few words of
the text.

Also, the current explanation appears to give more details about how
and why it happens than I felt was necessary. e.g. this is for user
documentation, not for a code comment. Basically, I'm wondering if the
whole thing could be "dumbed down" a little bit, keeping only the
information essential so the users can protect themselves. Maybe
something a bit more like below?

SUGGESTION
<caution>
<para>
Altering the <literal>publish_via_partition_root</literal> parameter
can lead to data loss or duplication at the subscriber because it
changes the identity and schema of the published tables.
</para>
<para>
This problem can be avoided by refraining from modifying
partition leaf tables
after the <command>ALTER PUBLICATION ... SET</command> until the
<link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... REFRESH PUBLICATION</command></link>
is executed, and by only refreshing using the <literal>copy_data
= off</literal> option.
</para>
</caution>

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#17Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Smith (#16)
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

On Fri, Oct 25, 2024 at 6:12 AM Peter Smith <smithpb2250@gmail.com> wrote:

Hi Kuroda-san,

This is only a caution for ALTER of 'publish_via_partition_root', so
that should be immediately clear up-front in the first few words of
the text.

Also, the current explanation appears to give more details about how
and why it happens than I felt was necessary. e.g. this is for user
documentation, not for a code comment. Basically, I'm wondering if the
whole thing could be "dumbed down" a little bit, keeping only the
information essential so the users can protect themselves. Maybe
something a bit more like below?

SUGGESTION
<caution>
<para>
Altering the <literal>publish_via_partition_root</literal> parameter
can lead to data loss or duplication at the subscriber because it
changes the identity and schema of the published tables.
</para>

This appears precise but lacks the key information that the problem
can happen when a partitioned root table is specified as a replication
target. So, how about one of the following:

* Altering the <literal>publish_via_partition_root</literal> parameter
when the partition root table is specified as the replication target
can lead to data loss or duplication at the subscriber because it
changes the identity and schema of the published tables.

* Altering the <literal>publish_via_partition_root</literal> parameter
can lead to data loss or duplication at the subscriber because it
changes the identity and schema of the published tables. Note this
happens only when the partition root table is specified as the
replication target.

<para>
This problem can be avoided by refraining from modifying
partition leaf tables
after the <command>ALTER PUBLICATION ... SET</command> until the
<link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... REFRESH PUBLICATION</command></link>
is executed, and by only refreshing using the <literal>copy_data
= off</literal> option.
</para>
</caution>

We can keep this part as you proposed.

--
With Regards,
Amit Kapila.

#18Peter Smith
smithpb2250@gmail.com
In reply to: Amit Kapila (#17)
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

On Fri, Oct 25, 2024 at 2:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 25, 2024 at 6:12 AM Peter Smith <smithpb2250@gmail.com> wrote:

Hi Kuroda-san,

This is only a caution for ALTER of 'publish_via_partition_root', so
that should be immediately clear up-front in the first few words of
the text.

Also, the current explanation appears to give more details about how
and why it happens than I felt was necessary. e.g. this is for user
documentation, not for a code comment. Basically, I'm wondering if the
whole thing could be "dumbed down" a little bit, keeping only the
information essential so the users can protect themselves. Maybe
something a bit more like below?

SUGGESTION
<caution>
<para>
Altering the <literal>publish_via_partition_root</literal> parameter
can lead to data loss or duplication at the subscriber because it
changes the identity and schema of the published tables.
</para>

This appears precise but lacks the key information that the problem
can happen when a partitioned root table is specified as a replication
target. So, how about one of the following:

* Altering the <literal>publish_via_partition_root</literal> parameter
when the partition root table is specified as the replication target
can lead to data loss or duplication at the subscriber because it
changes the identity and schema of the published tables.

* Altering the <literal>publish_via_partition_root</literal> parameter
can lead to data loss or duplication at the subscriber because it
changes the identity and schema of the published tables. Note this
happens only when the partition root table is specified as the
replication target.

+1 for the second one. (but maybe say "a partition root table" instead
of "the partition root table")

Show quoted text

<para>
This problem can be avoided by refraining from modifying
partition leaf tables
after the <command>ALTER PUBLICATION ... SET</command> until the
<link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... REFRESH PUBLICATION</command></link>
is executed, and by only refreshing using the <literal>copy_data
= off</literal> option.
</para>
</caution>

We can keep this part as you proposed.

--
With Regards,
Amit Kapila.

#19Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Amit Kapila (#17)
RE: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

Dear Amit, Peter,

SUGGESTION
<caution>
<para>
Altering the <literal>publish_via_partition_root</literal> parameter
can lead to data loss or duplication at the subscriber because it
changes the identity and schema of the published tables.
</para>

This appears precise but lacks the key information that the problem
can happen when a partitioned root table is specified as a replication
target. So, how about one of the following:

* Altering the <literal>publish_via_partition_root</literal> parameter
when the partition root table is specified as the replication target
can lead to data loss or duplication at the subscriber because it
changes the identity and schema of the published tables.

* Altering the <literal>publish_via_partition_root</literal> parameter
can lead to data loss or duplication at the subscriber because it
changes the identity and schema of the published tables. Note this
happens only when the partition root table is specified as the
replication target.

<para>
This problem can be avoided by refraining from modifying
partition leaf tables
after the <command>ALTER PUBLICATION ... SET</command> until

the

<link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... REFRESH PUBLICATION</command></link>
is executed, and by only refreshing using the <literal>copy_data
= off</literal> option.
</para>
</caution>

We can keep this part as you proposed.

Thanks for suggestions. I updated as you pointed out. I removed a comma from
"is executed, and..." because my Grammarly said like that.
PSA new version.

Best regards,
Hayato Kuroda
FUJITSU LIMITED

Attachments:

v4_add_caution.diffsapplication/octet-stream; name=v4_add_caution.diffsDownload+15-0
#20Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#19)
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.

On Fri, Oct 25, 2024 at 12:21 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Thanks for suggestions. I updated as you pointed out. I removed a comma from
"is executed, and..." because my Grammarly said like that.
PSA new version.

Pushed.

--
With Regards,
Amit Kapila.