[BUG] Unexpected action when publishing partition tables

Started by tanghy.fnst@fujitsu.comover 4 years ago16 messageshackers
Jump to latest
#1tanghy.fnst@fujitsu.com
tanghy.fnst@fujitsu.com

Hi

I met a problem when using logical replication. Maybe it's a bug in logical replication.
When publishing a partition table without replica identity, update
or delete operation can be successful in some cases.

For example:
create table tbl1 (a int) partition by range ( a );
create table tbl1_part1 partition of tbl1 for values from (1) to (101);
create table tbl1_part2 partition of tbl1 for values from (101) to (200);
insert into tbl1 select generate_series(1, 10);
delete from tbl1 where a=1;
create publication pub for table tbl1;
delete from tbl1 where a=2;

The last DELETE statement can be executed successfully, but it should report
error message about missing a replica identity.

I found this problem on HEAD and I could reproduce this problem at PG13 and
PG14. (Logical replication of partition table was introduced in PG13.)

Regards
Tang

#2Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: tanghy.fnst@fujitsu.com (#1)
RE: [BUG] Unexpected action when publishing partition tables
#3Amit Kapila
amit.kapila16@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#2)
Re: [BUG] Unexpected action when publishing partition tables

On Mon, Sep 6, 2021 at 1:49 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

From Mon, Sep 6, 2021 3:59 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:

I met a problem when using logical replication. Maybe it's a bug in logical
replication.
When publishing a partition table without replica identity, update
or delete operation can be successful in some cases.

For example:
create table tbl1 (a int) partition by range ( a );
create table tbl1_part1 partition of tbl1 for values from (1) to (101);
create table tbl1_part2 partition of tbl1 for values from (101) to (200);
insert into tbl1 select generate_series(1, 10);
delete from tbl1 where a=1;
create publication pub for table tbl1;
delete from tbl1 where a=2;

The last DELETE statement can be executed successfully, but it should report
error message about missing a replica identity.

I found this problem on HEAD and I could reproduce this problem at PG13 and
PG14. (Logical replication of partition table was introduced in PG13.)

Adding Amit L and Peter E who were involved in this work (commit:
17b9e7f9) to see if they have opinions on this matter.

I can reproduce this bug.

I think the reason is it didn't invalidate all the leaf partitions' relcache
when add a partitioned table to the publication, so the publication info was
not rebuilt.

The following code only invalidate the target table:
---
PublicationAddTables
publication_add_relation
/* Invalidate relcache so that publication info is rebuilt. */
CacheInvalidateRelcache(targetrel);
---

In addition, this problem can happen in both ADD TABLE, DROP
TABLE, and SET TABLE cases, so we need to invalidate the leaf partitions'
recache in all these cases.

Few comments:
=============
{
@@ -664,7 +673,13 @@ PublicationDropTables(Oid pubid, List *rels, bool
missing_ok)

  ObjectAddressSet(obj, PublicationRelRelationId, prid);
  performDeletion(&obj, DROP_CASCADE, 0);
+
+ relids = GetPubPartitionOptionRelations(relids, PUBLICATION_PART_LEAF,
+ relid);
  }
+
+ /* Invalidate relcache so that publication info is rebuilt. */
+ InvalidatePublicationRels(relids);
 }

We already register the invalidation for the main table in
RemovePublicationRelById which is called via performDeletion. I think
it is better to perform invalidation for partitions at that place.
Similarly is there a reason for not doing invalidations of partitions
in publication_add_relation()?

--
With Regards,
Amit Kapila.

#4Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Amit Kapila (#3)
RE: [BUG] Unexpected action when publishing partition tables

From Tues, Sep 7, 2021 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Sep 6, 2021 at 1:49 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:

I can reproduce this bug.

I think the reason is it didn't invalidate all the leaf partitions'
relcache when add a partitioned table to the publication, so the
publication info was not rebuilt.

The following code only invalidate the target table:
---
PublicationAddTables
publication_add_relation
/* Invalidate relcache so that publication info is rebuilt. */
CacheInvalidateRelcache(targetrel);
---

In addition, this problem can happen in both ADD TABLE, DROP TABLE,
and SET TABLE cases, so we need to invalidate the leaf partitions'
recache in all these cases.

Few comments:
=============
{
@@ -664,7 +673,13 @@ PublicationDropTables(Oid pubid, List *rels, bool
missing_ok)

ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
+
+ relids = GetPubPartitionOptionRelations(relids, PUBLICATION_PART_LEAF,
+ relid);
}
+
+ /* Invalidate relcache so that publication info is rebuilt. */
+ InvalidatePublicationRels(relids);
}

We already register the invalidation for the main table in
RemovePublicationRelById which is called via performDeletion. I think it is
better to perform invalidation for partitions at that place.
Similarly is there a reason for not doing invalidations of partitions in
publication_add_relation()?

Thanks for the comment. I originally intended to reduce the number of invalid
message when add/drop serval tables while each table has lots of partitions which
could exceed the MAX_RELCACHE_INVAL_MSGS. But that seems a rare case, so ,
I changed the code as suggested.

Attach new version patches which addressed the comment.

Best regards,
Hou zj

Attachments:

v2-0001-Made-the-existing-relation-cache-invalidation-an.patchapplication/octet-stream; name=v2-0001-Made-the-existing-relation-cache-invalidation-an.patchDownload+66-48
v2-0002-fix-publication-invalidation.patchapplication/octet-stream; name=v2-0002-fix-publication-invalidation.patchDownload+47-8
#5tanghy.fnst@fujitsu.com
tanghy.fnst@fujitsu.com
In reply to: Zhijie Hou (Fujitsu) (#4)
RE: [BUG] Unexpected action when publishing partition tables

---
PublicationAddTables
publication_add_relation
/* Invalidate relcache so that publication info is rebuilt. */
CacheInvalidateRelcache(targetrel);
---

In addition, this problem can happen in both ADD TABLE, DROP TABLE,
and SET TABLE cases, so we need to invalidate the leaf partitions'
recache in all these cases.

Few comments:
=============
{
@@ -664,7 +673,13 @@ PublicationDropTables(Oid pubid, List *rels, bool
missing_ok)

ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
+
+ relids = GetPubPartitionOptionRelations(relids, PUBLICATION_PART_LEAF,
+ relid);
}
+
+ /* Invalidate relcache so that publication info is rebuilt. */
+ InvalidatePublicationRels(relids);
}

We already register the invalidation for the main table in
RemovePublicationRelById which is called via performDeletion. I think it is
better to perform invalidation for partitions at that place.
Similarly is there a reason for not doing invalidations of partitions in
publication_add_relation()?

Thanks for the comment. I originally intended to reduce the number of invalid
message when add/drop serval tables while each table has lots of partitions which
could exceed the MAX_RELCACHE_INVAL_MSGS. But that seems a rare case, so ,
I changed the code as suggested.

Attach new version patches which addressed the comment.

Thanks for your patch. I confirmed that the problem I reported was fixed.

Besides, Your v2 patch also fixed an existing a problem about "DROP PUBLICATION" on HEAD.
(Publication was dropped but it still reported errors about replica identity when trying to
update or delete a partition table.)

Regards
Tang

#6vignesh C
vignesh21@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#4)
Re: [BUG] Unexpected action when publishing partition tables

On Tue, Sep 7, 2021 at 11:38 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

From Tues, Sep 7, 2021 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Sep 6, 2021 at 1:49 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:

I can reproduce this bug.

I think the reason is it didn't invalidate all the leaf partitions'
relcache when add a partitioned table to the publication, so the
publication info was not rebuilt.

The following code only invalidate the target table:
---
PublicationAddTables
publication_add_relation
/* Invalidate relcache so that publication info is rebuilt. */
CacheInvalidateRelcache(targetrel);
---

In addition, this problem can happen in both ADD TABLE, DROP TABLE,
and SET TABLE cases, so we need to invalidate the leaf partitions'
recache in all these cases.

Few comments:
=============
{
@@ -664,7 +673,13 @@ PublicationDropTables(Oid pubid, List *rels, bool
missing_ok)

ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
+
+ relids = GetPubPartitionOptionRelations(relids, PUBLICATION_PART_LEAF,
+ relid);
}
+
+ /* Invalidate relcache so that publication info is rebuilt. */
+ InvalidatePublicationRels(relids);
}

We already register the invalidation for the main table in
RemovePublicationRelById which is called via performDeletion. I think it is
better to perform invalidation for partitions at that place.
Similarly is there a reason for not doing invalidations of partitions in
publication_add_relation()?

Thanks for the comment. I originally intended to reduce the number of invalid
message when add/drop serval tables while each table has lots of partitions which
could exceed the MAX_RELCACHE_INVAL_MSGS. But that seems a rare case, so ,
I changed the code as suggested.

Attach new version patches which addressed the comment.

Thanks for fixing this issue. The bug gets fixed by the patch, I did
not find any issues in my testing.
I just had one minor comment:

We could clean the table at the end by calling DROP TABLE testpub_parted2:
+-- still fail, because parent's publication replicates updates
+UPDATE testpub_parted2 SET a = 2;
+ERROR:  cannot update table "testpub_parted2" because it does not
have a replica identity and publishes updates
+HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
+-- works again, because update is no longer replicated
+UPDATE testpub_parted2 SET a = 2;

Regards,
Vignesh

#7Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: vignesh C (#6)
RE: [BUG] Unexpected action when publishing partition tables

On Tuesday, September 14, 2021 10:41 PM vignesh C <vignesh21@gmail.com> wrote:

On Tue, Sep 7, 2021 at 11:38 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:

Attach new version patches which addressed the comment.

Thanks for fixing this issue. The bug gets fixed by the patch, I did not find any
issues in my testing.
I just had one minor comment:

We could clean the table at the end by calling DROP TABLE testpub_parted2:
+-- still fail, because parent's publication replicates updates UPDATE
+testpub_parted2 SET a = 2;
+ERROR:  cannot update table "testpub_parted2" because it does not
have a replica identity and publishes updates
+HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER
TABLE.
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
+-- works again, because update is no longer replicated UPDATE
+testpub_parted2 SET a = 2;

Thanks for the comment.
Attach new version patches which clean the table at the end.

Best regards,
Hou zj

Attachments:

v3-0001-Made-the-existing-relation-cache-invalidation-an.patchapplication/octet-stream; name=v3-0001-Made-the-existing-relation-cache-invalidation-an.patchDownload+66-48
v3-0002-Invalidate-all-partitions.patchapplication/octet-stream; name=v3-0002-Invalidate-all-partitions.patchDownload+49-10
#8Amit Kapila
amit.kapila16@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#7)
Re: [BUG] Unexpected action when publishing partition tables

On Thu, Sep 16, 2021 at 7:15 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

On Tuesday, September 14, 2021 10:41 PM vignesh C <vignesh21@gmail.com> wrote:

On Tue, Sep 7, 2021 at 11:38 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:

Thanks for the comment.
Attach new version patches which clean the table at the end.

+ * For partitioned table contained in the publication, we must
+ * invalidate all partitions contained in the respective partition
+ * trees, not just the one explicitly mentioned in the publication.

Can we slightly change the above comment as: "For the partitioned
tables, we must invalidate all partitions contained in the respective
partition hierarchies, not just the one explicitly mentioned in the
publication. This is required because we implicitly publish the child
tables when the parent table is published."

Apart from this, the patch looks good to me.

I think we need to back-patch this till v13. What do you think? If
yes, then can you please prepare and test the patches for
back-branches? Does anyone else have opinions on back-patching this?

I think this is not a show-stopper bug, so even if we decide to
back-patch, I will do it next week after 14 RC1.

--
With Regards,
Amit Kapila.

#9Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Amit Kapila (#8)
RE: [BUG] Unexpected action when publishing partition tables

On Thursday, September 16, 2021 6:05 PM Amit Kapila <amit.kapila16@gmail.com>

On Tuesday, September 14, 2021 10:41 PM vignesh C <vignesh21@gmail.com> wrote:

On Tue, Sep 7, 2021 at 11:38 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:

Thanks for the comment.
Attach new version patches which clean the table at the end.

+ * For partitioned table contained in the publication, we must
+ * invalidate all partitions contained in the respective partition
+ * trees, not just the one explicitly mentioned in the publication.

Can we slightly change the above comment as: "For the partitioned tables, we
must invalidate all partitions contained in the respective partition hierarchies,
not just the one explicitly mentioned in the publication. This is required
because we implicitly publish the child tables when the parent table is
published."

Apart from this, the patch looks good to me.

I think we need to back-patch this till v13. What do you think?

I agreed.

Attach patches for back-branch, each has passed regression tests and pgindent.

Best regards,
Hou zj

Attachments:

HEAD-Invalidate-all-partitions-for-partitioned-table-in-p.patchapplication/octet-stream; name=HEAD-Invalidate-all-partitions-for-partitioned-table-in-p.patchDownload+116-56
PG14-Invalidate-all-partitions-for-partitioned-table-in-p.patchapplication/octet-stream; name=PG14-Invalidate-all-partitions-for-partitioned-table-in-p.patchDownload+116-56
PG13-Invalidate-all-partitions-for-partitioned-table-in-p.patchapplication/octet-stream; name=PG13-Invalidate-all-partitions-for-partitioned-table-in-p.patchDownload+116-56
#10Amit Kapila
amit.kapila16@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#9)
Re: [BUG] Unexpected action when publishing partition tables

On Fri, Sep 17, 2021 at 11:36 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

On Thursday, September 16, 2021 6:05 PM Amit Kapila <amit.kapila16@gmail.com>

On Tuesday, September 14, 2021 10:41 PM vignesh C <vignesh21@gmail.com> wrote:

On Tue, Sep 7, 2021 at 11:38 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:

Thanks for the comment.
Attach new version patches which clean the table at the end.

+ * For partitioned table contained in the publication, we must
+ * invalidate all partitions contained in the respective partition
+ * trees, not just the one explicitly mentioned in the publication.

Can we slightly change the above comment as: "For the partitioned tables, we
must invalidate all partitions contained in the respective partition hierarchies,
not just the one explicitly mentioned in the publication. This is required
because we implicitly publish the child tables when the parent table is
published."

Apart from this, the patch looks good to me.

I think we need to back-patch this till v13. What do you think?

I agreed.

Attach patches for back-branch, each has passed regression tests and pgindent.

Thanks, your patches look good to me. I'll push them sometime next
week after Tuesday unless there are any comments.

--
With Regards,
Amit Kapila.

#11Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#10)
Re: [BUG] Unexpected action when publishing partition tables

On Fri, Sep 17, 2021 at 4:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Thanks, your patches look good to me. I'll push them sometime next
week after Tuesday unless there are any comments.

Pushed.

--
With Regards,
Amit Kapila.

#12Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Kapila (#10)
Re: [BUG] Unexpected action when publishing partition tables

On Fri, Sep 17, 2021 at 7:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Sep 17, 2021 at 11:36 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

On Thursday, September 16, 2021 6:05 PM Amit Kapila <amit.kapila16@gmail.com>

On Tuesday, September 14, 2021 10:41 PM vignesh C <vignesh21@gmail.com> wrote:

On Tue, Sep 7, 2021 at 11:38 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:

Thanks for the comment.
Attach new version patches which clean the table at the end.

+ * For partitioned table contained in the publication, we must
+ * invalidate all partitions contained in the respective partition
+ * trees, not just the one explicitly mentioned in the publication.

Can we slightly change the above comment as: "For the partitioned tables, we
must invalidate all partitions contained in the respective partition hierarchies,
not just the one explicitly mentioned in the publication. This is required
because we implicitly publish the child tables when the parent table is
published."

Apart from this, the patch looks good to me.

I think we need to back-patch this till v13. What do you think?

I agreed.

Attach patches for back-branch, each has passed regression tests and pgindent.

Thanks, your patches look good to me. I'll push them sometime next
week after Tuesday unless there are any comments.

Thanks Amit, Tang, and Hou for this.

Sorry that I didn't comment on this earlier, but I think either
GetPubPartitionOptionRelations() or InvalidatePublicationRels()
introduced in the commit 4548c76738b should lock the partitions, just
like to the parent partitioned table would be, before invalidating
them. There may be some hazards to invalidating a relation without
locking it.

For example, maybe add a 'lockmode' parameter to
GetPubPartitionOptionRelations() which it passes down to
find_all_inheritors() instead of NoLock as now. And make all sites
except GetPublicationRelations() pass ShareUpdateExclusiveLock for it.
Maybe like the attached.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachments:

lock-publication-partitions-before-inval.patchapplication/octet-stream; name=lock-publication-partitions-before-inval.patchDownload+17-6
#13Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Langote (#12)
Re: [BUG] Unexpected action when publishing partition tables

On Thu, Oct 7, 2021 at 12:39 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Fri, Sep 17, 2021 at 7:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Sep 17, 2021 at 11:36 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

On Thursday, September 16, 2021 6:05 PM Amit Kapila <amit.kapila16@gmail.com>

On Tuesday, September 14, 2021 10:41 PM vignesh C <vignesh21@gmail.com> wrote:

On Tue, Sep 7, 2021 at 11:38 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:

Thanks for the comment.
Attach new version patches which clean the table at the end.

+ * For partitioned table contained in the publication, we must
+ * invalidate all partitions contained in the respective partition
+ * trees, not just the one explicitly mentioned in the publication.

Can we slightly change the above comment as: "For the partitioned tables, we
must invalidate all partitions contained in the respective partition hierarchies,
not just the one explicitly mentioned in the publication. This is required
because we implicitly publish the child tables when the parent table is
published."

Apart from this, the patch looks good to me.

I think we need to back-patch this till v13. What do you think?

I agreed.

Attach patches for back-branch, each has passed regression tests and pgindent.

Thanks, your patches look good to me. I'll push them sometime next
week after Tuesday unless there are any comments.

Thanks Amit, Tang, and Hou for this.

Sorry that I didn't comment on this earlier, but I think either
GetPubPartitionOptionRelations() or InvalidatePublicationRels()
introduced in the commit 4548c76738b should lock the partitions, just
like to the parent partitioned table would be, before invalidating
them. There may be some hazards to invalidating a relation without
locking it.

I see your point but then on the same lines didn't the existing code
"for all tables" case (where we call CacheInvalidateRelcacheAll()
without locking all relations) have a similar problem. Also, in your
patch, you are assuming that the callers of GetPublicationRelations()
will lock all the relations but what when it gets called from
AlterPublicationOptions()?

--
With Regards,
Amit Kapila.

#14Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Kapila (#13)
Re: [BUG] Unexpected action when publishing partition tables

Hi Amit,

On Fri, Oct 8, 2021 at 12:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 7, 2021 at 12:39 PM Amit Langote <amitlangote09@gmail.com> wrote:

Sorry that I didn't comment on this earlier, but I think either
GetPubPartitionOptionRelations() or InvalidatePublicationRels()
introduced in the commit 4548c76738b should lock the partitions, just
like to the parent partitioned table would be, before invalidating
them. There may be some hazards to invalidating a relation without
locking it.

I see your point but then on the same lines didn't the existing code
"for all tables" case (where we call CacheInvalidateRelcacheAll()
without locking all relations) have a similar problem.

There might be. I checked to see how other callers/modules use
CacheInvalidateRelcacheAll(), though it seems that only the functions
in publicationcmds.c use it or really was invented in 665d1fad99e for
use by publication commands.

Maybe I need to look harder than I've done for any examples of hazard.

Also, in your
patch, you are assuming that the callers of GetPublicationRelations()
will lock all the relations but what when it gets called from
AlterPublicationOptions()?

Ah, my bad. I hadn't noticed that one for some reason.

Now that you mention it, I do find it somewhat concerning (on the
similar grounds as what prompted my previous email) that
AlterPublicationOptions() does away with any locking on the affected
relations.

Anyway, I'll think a bit more about the possible hazards of not doing
the locking and will reply again if there's indeed a problem(s) that
needs to be fixed.

--
Amit Langote
EDB: http://www.enterprisedb.com

#15Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Langote (#14)
Re: [BUG] Unexpected action when publishing partition tables

On Wed, Oct 13, 2021 at 9:11 AM Amit Langote <amitlangote09@gmail.com> wrote:

Hi Amit,

On Fri, Oct 8, 2021 at 12:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Oct 7, 2021 at 12:39 PM Amit Langote <amitlangote09@gmail.com> wrote:

Sorry that I didn't comment on this earlier, but I think either
GetPubPartitionOptionRelations() or InvalidatePublicationRels()
introduced in the commit 4548c76738b should lock the partitions, just
like to the parent partitioned table would be, before invalidating
them. There may be some hazards to invalidating a relation without
locking it.

I see your point but then on the same lines didn't the existing code
"for all tables" case (where we call CacheInvalidateRelcacheAll()
without locking all relations) have a similar problem.

There might be. I checked to see how other callers/modules use
CacheInvalidateRelcacheAll(), though it seems that only the functions
in publicationcmds.c use it or really was invented in 665d1fad99e for
use by publication commands.

Maybe I need to look harder than I've done for any examples of hazard.

Also, in your
patch, you are assuming that the callers of GetPublicationRelations()
will lock all the relations but what when it gets called from
AlterPublicationOptions()?

Ah, my bad. I hadn't noticed that one for some reason.

Now that you mention it, I do find it somewhat concerning (on the
similar grounds as what prompted my previous email) that
AlterPublicationOptions() does away with any locking on the affected
relations.

Anyway, I'll think a bit more about the possible hazards of not doing
the locking and will reply again if there's indeed a problem(s) that
needs to be fixed.

I think you can try to reproduce the problem via the debugger. You can
stop before calling GetPubPartitionOptionRelations in
publication_add_relation() in session-1 and then from another session
(say session-2) try to delete one of the partition table (without
replica identity). Then stop in session-2 somewhere after acquiring
lock to the corresponding partition relation. Now, continue in
session-1 and invalidate the rels and let it complete the command. I
think session-2 will complete the update without processing the
invalidations.

If the above is true, then, this breaks the following behavior
specified in the documentation: "The tables added to a publication
that publishes UPDATE and/or DELETE operations must have REPLICA
IDENTITY defined. Otherwise, those operations will be disallowed on
those tables.". Also, I think such updates won't be replicated on
subscribers as there is no replica identity or primary key column.

--
With Regards,
Amit Kapila.

#16Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#15)
Re: [BUG] Unexpected action when publishing partition tables

On Mon, Oct 18, 2021 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Oct 13, 2021 at 9:11 AM Amit Langote <amitlangote09@gmail.com> wrote:

Anyway, I'll think a bit more about the possible hazards of not doing
the locking and will reply again if there's indeed a problem(s) that
needs to be fixed.

I think you can try to reproduce the problem via the debugger. You can
stop before calling GetPubPartitionOptionRelations in
publication_add_relation() in session-1 and then from another session
(say session-2) try to delete one of the partition table (without
replica identity). Then stop in session-2 somewhere after acquiring
lock to the corresponding partition relation. Now, continue in
session-1 and invalidate the rels and let it complete the command. I
think session-2 will complete the update without processing the
invalidations.

In the last sentence, it should be delete rather than update.

--
With Regards,
Amit Kapila.