Replica Identity check of partition table on subscriber
Hi hackers,
I saw a problem in logical replication: when the target table on the subscriber is a
partitioned table, only the Replica Identity of the partitioned table is checked
for consistency with the publisher; the Replica Identity of the partition is not
checked.
For example:
-- publisher --
create table tbl (a int not null, b int);
create unique INDEX ON tbl (a);
alter table tbl replica identity using INDEX tbl_a_idx;
create publication pub for table tbl;
-- subscriber --
-- table tbl (parent table) has RI index, while table child has no RI index.
create table tbl (a int not null, b int) partition by range (a);
create table child partition of tbl default;
create unique INDEX ON tbl (a);
alter table tbl replica identity using INDEX tbl_a_idx;
create subscription sub connection 'port=5432 dbname=postgres' publication pub;
-- publisher --
insert into tbl values (11,11);
update tbl set a=a+1;
It caused an assertion failure on the subscriber:
TRAP: FailedAssertion("OidIsValid(idxoid) || (remoterel->replident == REPLICA_IDENTITY_FULL)", File: "worker.c", Line: 2088, PID: 1616523)
The backtrace is attached.
We got the assertion failure because idxoid is invalid, as table child has
neither a Replica Identity index nor a Primary Key. There is a check in
check_relation_updatable(), but it checked table tbl (the parent table), which
passed the check.
I think one approach to fix it is to check the target partition in this case,
instead of the partitioned table.
When trying to fix it, I saw some other problems with updating the partition
map cache.
a) In logicalrep_partmap_invalidate_cb(), the type of the entry in
LogicalRepPartMap should be LogicalRepPartMapEntry, instead of
LogicalRepRelMapEntry.
b) In logicalrep_partition_open(), it didn't check if the entry is valid.
c) When the publisher sends a new relation mapping, only the relation map cache
will be updated, not the partition map cache. I think it should also be updated
because it has remote relation information, too.
Attached are two patches which try to fix them.
0001 patch: fixes the above three problems with the partition map cache.
0002 patch: fixes the assertion failure by checking the Replica Identity of the
partition when the target table is a partitioned table.
Thanks to Hou Zhijie for helping me with the first patch.
I will add a test for it later if no one objects to this fix.
Regards,
Shi yu
Attachments:
v1-0001-Fix-partition-map-cache-issues.patch
v1-0002-Check-partition-table-replica-identity-on-subscri.patch
backtrace.txt
On Wed, Jun 8, 2022 at 2:17 PM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
I saw a problem in logical replication, when the target table on subscriber is a
partitioned table, it only checks whether the Replica Identity of partitioned
table is consistent with the publisher, and doesn't check Replica Identity of
the partition.
...
It caused an assertion failure on subscriber:
TRAP: FailedAssertion("OidIsValid(idxoid) || (remoterel->replident == REPLICA_IDENTITY_FULL)", File: "worker.c", Line: 2088, PID: 1616523)
The backtrace is attached.
We got the assertion failure because idxoid is invalid, as table child has no
Replica Identity or Primary Key. We have a check in check_relation_updatable(),
but what it checked is table tbl (the parent table) and it passed the check.
I can reproduce the problem. This seems to be the problem since commit
f1ac27bf (Add logical replication support to replicate into
partitioned tables), so adding Amit L. and Peter E.
I think one approach to fix it is to check the target partition in this case,
instead of the partitioned table.
This approach sounds reasonable to me. One minor point:
+/*
+ * Check that replica identity matches.
+ *
+ * We allow for stricter replica identity (fewer columns) on subscriber as
+ * that will not stop us from finding unique tuple. IE, if publisher has
+ * identity (id,timestamp) and subscriber just (id) this will not be a
+ * problem, but in the opposite scenario it will.
+ *
+ * Don't throw any error here just mark the relation entry as not updatable,
+ * as replica identity is only for updates and deletes but inserts can be
+ * replicated even without it.
+ */
+static void
+logicalrep_check_updatable(LogicalRepRelMapEntry *entry)
Can we name this function as logicalrep_rel_mark_updatable as we are
doing that? If so, change the comments as well.
When trying to fix it, I saw some other problems about updating partition map
cache.
a) In logicalrep_partmap_invalidate_cb(), the type of the entry in
LogicalRepPartMap should be LogicalRepPartMapEntry, instead of
LogicalRepRelMapEntry.
b) In logicalrep_partition_open(), it didn't check if the entry is valid.
c) When the publisher send new relation mapping, only relation map cache will be
updated, and partition map cache wouldn't. I think it also should be updated
because it has remote relation information, too.
Is there any test case that can show the problem due to your above observations?
--
With Regards,
Amit Kapila.
Hi Amit,
On Thu, Jun 9, 2022 at 8:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jun 8, 2022 at 2:17 PM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
I saw a problem in logical replication, when the target table on subscriber is a
partitioned table, it only checks whether the Replica Identity of partitioned
table is consistent with the publisher, and doesn't check Replica Identity of
the partition.
...
It caused an assertion failure on subscriber:
TRAP: FailedAssertion("OidIsValid(idxoid) || (remoterel->replident == REPLICA_IDENTITY_FULL)", File: "worker.c", Line: 2088, PID: 1616523)
The backtrace is attached.
We got the assertion failure because idxoid is invalid, as table child has no
Replica Identity or Primary Key. We have a check in check_relation_updatable(),
but what it checked is table tbl (the parent table) and it passed the check.

I can reproduce the problem. This seems to be the problem since commit
f1ac27bf (Add logical replication support to replicate into
partitioned tables), so adding Amit L. and Peter E.
Thanks, I can see the problem.
I have looked at other mentioned problems with the code too and agree
they look like bugs.
Both patches look to be on the right track to fix the issues, but will
look more closely to see if I've anything to add.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On Thu, June 9, 2022 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think one approach to fix it is to check the target partition in this case,
instead of the partitioned table.

This approach sounds reasonable to me. One minor point:

+/*
+ * Check that replica identity matches.
+ *
+ * We allow for stricter replica identity (fewer columns) on subscriber as
+ * that will not stop us from finding unique tuple. IE, if publisher has
+ * identity (id,timestamp) and subscriber just (id) this will not be a
+ * problem, but in the opposite scenario it will.
+ *
+ * Don't throw any error here just mark the relation entry as not updatable,
+ * as replica identity is only for updates and deletes but inserts can be
+ * replicated even without it.
+ */
+static void
+logicalrep_check_updatable(LogicalRepRelMapEntry *entry)

Can we name this function as logicalrep_rel_mark_updatable as we are
doing that? If so, change the comments as well.
OK. Modified.
When trying to fix it, I saw some other problems about updating partition map
cache.
a) In logicalrep_partmap_invalidate_cb(), the type of the entry in
LogicalRepPartMap should be LogicalRepPartMapEntry, instead of
LogicalRepRelMapEntry.
b) In logicalrep_partition_open(), it didn't check if the entry is valid.
c) When the publisher send new relation mapping, only relation map cache will be
updated, and partition map cache wouldn't. I think it also should be updated
because it has remote relation information, too.

Is there any test case that can show the problem due to your above
observations?
Please see the following case.
-- publisher
create table tbl (a int primary key, b int);
create publication pub for table tbl;
-- subscriber
create table tbl (a int primary key, b int, c int) partition by range (a);
create table child partition of tbl default;
-- publisher, make cache
insert into tbl values (1,1);
update tbl set a=a+1;
alter table tbl add column c int;
update tbl set c=1 where a=2;
-- subscriber
postgres=# select * from tbl;
a | b | c
---+---+---
2 | 1 |
(1 row)
The update of column c failed on the subscriber.
After applying the first patch, it works fine.
I have added this case to the first patch, and also added a test case for the
second patch.
Attached the new patches.
Regards,
Shi yu
Attachments:
v2-0001-Fix-partition-map-cache-issues.patch
v2-0002-Check-partition-table-replica-identity-on-subscri.patch
Hello,
On Wed, Jun 8, 2022 at 5:47 PM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
Hi hackers,
I saw a problem in logical replication, when the target table on subscriber is a
partitioned table, it only checks whether the Replica Identity of partitioned
table is consistent with the publisher, and doesn't check Replica Identity of
the partition.
For example:
-- publisher --
create table tbl (a int not null, b int);
create unique INDEX ON tbl (a);
alter table tbl replica identity using INDEX tbl_a_idx;
create publication pub for table tbl;
-- subscriber --
-- table tbl (parent table) has RI index, while table child has no RI index.
create table tbl (a int not null, b int) partition by range (a);
create table child partition of tbl default;
create unique INDEX ON tbl (a);
alter table tbl replica identity using INDEX tbl_a_idx;
create subscription sub connection 'port=5432 dbname=postgres' publication pub;
-- publisher --
insert into tbl values (11,11);
update tbl set a=a+1;
It caused an assertion failure on subscriber:
TRAP: FailedAssertion("OidIsValid(idxoid) || (remoterel->replident == REPLICA_IDENTITY_FULL)", File: "worker.c", Line: 2088, PID: 1616523)
The backtrace is attached.
We got the assertion failure because idxoid is invalid, as table child has no
Replica Identity or Primary Key. We have a check in check_relation_updatable(),
but what it checked is table tbl (the parent table) and it passed the check.

I think one approach to fix it is to check the target partition in this case,
instead of the partitioned table.
That makes sense. A normal user update of a partitioned table will
only perform CheckCmdReplicaIdentity() for leaf partitions and the
logical replication updates should do the same. I may have been
confused at the time to think that ALTER TABLE REPLICA IDENTITY makes
sure that the replica identities of all relations in a partition tree
are forced to be the same at all times, though it seems that the patch
to do so [1] didn't actually get in. I guess adding a test case would
have helped.
When trying to fix it, I saw some other problems about updating partition map
cache.
a) In logicalrep_partmap_invalidate_cb(), the type of the entry in
LogicalRepPartMap should be LogicalRepPartMapEntry, instead of
LogicalRepRelMapEntry.
Indeed.
b) In logicalrep_partition_open(), it didn't check if the entry is valid.
Yeah, that's bad. Actually, it seems that localrelvalid stuff for
LogicalRepRelMapEntry came in 3d65b0593c5, but it's likely we missed
in that commit that LogicalRepPartMapEntrys contain copies of, not
pointers to, the relevant parent's entry. This patch fixes that
oversight.
c) When the publisher send new relation mapping, only relation map cache will be
updated, and partition map cache wouldn't. I think it also should be updated
because it has remote relation information, too.
Yes, again a result of forgetting that the partition map entries have
copies of relation map entries.
+logicalrep_partmap_invalidate
I wonder why not call this logicalrep_partmap_update() to go with
logicalrep_relmap_update()? It seems confusing to have
logicalrep_partmap_invalidate() right next to
logicalrep_partmap_invalidate_cb().
+/*
+ * Invalidate the existing entry in the partition map.
I think logicalrep_partmap_invalidate() may update *multiple* entries,
because the hash table scan may find multiple PartMapEntrys containing
a copy of the RelMapEntry with given remoteid, that is, for multiple
partitions of a given local parent table mapped to that remote
relation. So, please fix the comment as:
Invalidate/Update the entries in the partition map that refer to 'remoterel'
Likewise:
+ /* Invalidate the corresponding partition map as well */
Maybe, this should say:
Also update all entries in the partition map that refer to 'remoterel'.
In 0002:
+logicalrep_check_updatable
+1 to Amit's suggestion to use "mark" instead of "check".
@@ -1735,6 +1735,13 @@ apply_handle_insert_internal(ApplyExecutionData *edata,
static void
check_relation_updatable(LogicalRepRelMapEntry *rel)
{
+ /*
+ * If it is a partitioned table, we don't check it, we will check its
+ * partition later.
+ */
+ if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+ return;
Why do this? I mean, logicalrep_check_updatable() doesn't care
whether the relation is partitioned or not -- it does all the work
regardless.
I suggest we don't add this check in check_relation_updatable().
+ /* Check if we can do the update or delete. */
Maybe mention "leaf partition", as:
Check if we can do the update or delete on the leaf partition.
BTW, the following hunk in patch 0002 should really be a part of 0001.
@@ -584,7 +594,6 @@ logicalrep_partition_open(LogicalRepRelMapEntry *root,
Oid partOid = RelationGetRelid(partrel);
AttrMap *attrmap = root->attrmap;
bool found;
- int i;
MemoryContext oldctx;
if (LogicalRepPartMap == NULL)
Thanks to Hou Zhijie for helping me in the first patch.
Thank you both for the fixes.
I will add a test for it later
That would be very welcome.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
[1]: /messages/by-id/201902041630.gpadougzab7v@alvherre.pgsql
On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com> wrote:
+logicalrep_partmap_invalidate
I wonder why not call this logicalrep_partmap_update() to go with
logicalrep_relmap_update()? It seems confusing to have
logicalrep_partmap_invalidate() right next to
logicalrep_partmap_invalidate_cb().
I am thinking about why we need to update the relmap in this new
function logicalrep_partmap_invalidate()? I think it may be better to
do it in logicalrep_partition_open() when actually required,
otherwise, we end up doing a lot of work that may not be of use unless
the corresponding partition is accessed. Also, it seems awkward to me
that we do the same thing in this new function
logicalrep_partmap_invalidate() and then also in
logicalrep_partition_open() under different conditions.
One more point about the 0001, it doesn't seem to have a test that
validates logicalrep_partmap_invalidate_cb() functionality. I think
for that we need to Alter the local table (table on the subscriber
side). Can we try to write a test for it?
--
With Regards,
Amit Kapila.
On Saturday, June 11, 2022 9:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com>
wrote:
+logicalrep_partmap_invalidate
I wonder why not call this logicalrep_partmap_update() to go with
logicalrep_relmap_update()? It seems confusing to have
logicalrep_partmap_invalidate() right next to
logicalrep_partmap_invalidate_cb().

I am thinking about why we need to update the relmap in this new function
logicalrep_partmap_invalidate()? I think it may be better to do it in
logicalrep_partition_open() when actually required, otherwise, we end up doing
a lot of work that may not be of use unless the corresponding partition is
accessed. Also, it seems awkward to me that we do the same thing in this new
function
logicalrep_partmap_invalidate() and then also in
logicalrep_partition_open() under different conditions.

One more point about the 0001, it doesn't seem to have a test that validates
logicalrep_partmap_invalidate_cb() functionality. I think for that we need to Alter
the local table (table on the subscriber side). Can we try to write a test for it?
Thanks to Amit L. and Amit K. for your comments! I agree with this point.
Here is the new version patch set which tries to address all these comments.
In addition, when reviewing the code, I found some other related
problems in the code.
1)
entry->attrmap = make_attrmap(map->maplen);
for (attno = 0; attno < entry->attrmap->maplen; attno++)
{
AttrNumber root_attno = map->attnums[attno];
entry->attrmap->attnums[attno] = attrmap->attnums[root_attno - 1];
}
In this case, it's possible that 'attno' points to a dropped column, in which
case root_attno would be '0'. I think in this case we should just set
entry->attrmap->attnums[attno] to '-1' instead of accessing
attrmap->attnums[]. I included this change in 0001 because the test cases which
can reproduce these problems are related (we need to ALTER the partition on the
subscriber to reproduce it).
2)
if (entry->attrmap)
pfree(entry->attrmap);
I think we should use free_attrmap instead of pfree here to avoid a memory leak.
And we also need to check the attrmap in logicalrep_rel_open() and
logicalrep_partition_open() and free it if needed. I am not sure whether we
should put this in the 0001 patch, so I attached a separate patch for it. We can
merge it later if needed.
Best regards,
Hou zj
Attachments:
v3-0002-Check-partition-table-replica-identity-on-subscriber.patch
v3-0003-fix-memory-leak-about-attrmap.patch
v3-0001-Fix-partition-map-cache-issues.patch
On Sat, Jun 11, 2022 at 2:36 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Saturday, June 11, 2022 9:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com>
wrote:
+logicalrep_partmap_invalidate
I wonder why not call this logicalrep_partmap_update() to go with
logicalrep_relmap_update()? It seems confusing to have
logicalrep_partmap_invalidate() right next to
logicalrep_partmap_invalidate_cb().

I am thinking about why we need to update the relmap in this new function
logicalrep_partmap_invalidate()? I think it may be better to do it in
logicalrep_partition_open() when actually required, otherwise, we end up doing
a lot of work that may not be of use unless the corresponding partition is
accessed. Also, it seems awkward to me that we do the same thing in this new
function
logicalrep_partmap_invalidate() and then also in
logicalrep_partition_open() under different conditions.

One more point about the 0001, it doesn't seem to have a test that validates
logicalrep_partmap_invalidate_cb() functionality. I think for that we need to Alter
the local table (table on the subscriber side). Can we try to write a test for it?

Thanks to Amit L. and Amit K. for your comments! I agree with this point.
Here is the version patch set which tries to address all these comments.

In addition, when reviewing the code, I found some other related
problems in the code.
1)
entry->attrmap = make_attrmap(map->maplen);
for (attno = 0; attno < entry->attrmap->maplen; attno++)
{
AttrNumber root_attno = map->attnums[attno];
entry->attrmap->attnums[attno] = attrmap->attnums[root_attno - 1];
}
In this case, it's possible that 'attno' points to a dropped column in which
case the root_attno would be '0'. I think in this case we should just set the
entry->attrmap->attnums[attno] to '-1' instead of accessing the
attrmap->attnums[]. I included this change in 0001 because the testcase which
can reproduce these problems are related(we need to ALTER the partition on
subscriber to reproduce it).
Hmm, this appears to be a different issue. Can we separate out the
bug-fix code for the subscriber-side issue caused by the DDL on the
subscriber?
Few other comments:
+ * Note that we don't update the remoterel information in the entry here,
+ * we will update the information in logicalrep_partition_open to save
+ * unnecessary work.
+ */
+void
+logicalrep_partmap_invalidate(LogicalRepRelation *remoterel)
/to save/to avoid
Also, I agree with Amit L. that it is confusing to have
logicalrep_partmap_invalidate() right next to
logicalrep_partmap_invalidate_cb() and both have somewhat different
kinds of logic. So, we can either name it as
logicalrep_partmap_reset_relmap() or logicalrep_partmap_update()
unless you have any other better suggestions? Accordingly, change the
comment atop this function.
--
With Regards,
Amit Kapila.
On Monday, June 13, 2022 1:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Jun 11, 2022 at 2:36 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Saturday, June 11, 2022 9:36 AM Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Fri, Jun 10, 2022 at 2:26 PM Amit Langote
<amitlangote09@gmail.com>
wrote:
+logicalrep_partmap_invalidate
I wonder why not call this logicalrep_partmap_update() to go with
logicalrep_relmap_update()? It seems confusing to have
logicalrep_partmap_invalidate() right next to
logicalrep_partmap_invalidate_cb().I am thinking about why we need to update the relmap in this new
function logicalrep_partmap_invalidate()? I think it may be better
to do it in
logicalrep_partition_open() when actually required, otherwise, we
end up doing a lot of work that may not be of use unless the
corresponding partition is accessed. Also, it seems awkward to me
that we do the same thing in this new function
logicalrep_partmap_invalidate() and then also in
logicalrep_partition_open() under different conditions.

One more point about the 0001, it doesn't seem to have a test that
validates
logicalrep_partmap_invalidate_cb() functionality. I think for that
we need to Alter the local table (table on the subscriber side). Can we try to
write a test for it?
Thanks to Amit L. and Amit K. for your comments! I agree with this point.
Here is the version patch set which tries to address all these comments.

In addition, when reviewing the code, I found some other related
problems in the code.
1)
entry->attrmap = make_attrmap(map->maplen);
for (attno = 0; attno < entry->attrmap->maplen; attno++)
{
AttrNumber root_attno = map->attnums[attno];
entry->attrmap->attnums[attno] =
attrmap->attnums[root_attno - 1];
}
In this case, It’s possible that 'attno' points to a dropped column in
which case the root_attno would be '0'. I think in this case we should
just set the
entry->attrmap->attnums[attno] to '-1' instead of accessing the
attrmap->attnums[]. I included this change in 0001 because the
testcase which
can reproduce these problems are related(we need to ALTER the
partition on subscriber to reproduce it).

Hmm, this appears to be a different issue. Can we separate out the bug-fix
code for the subscriber-side issue caused by the DDL on the subscriber?

Few other comments:

+ * Note that we don't update the remoterel information in the entry here,
+ * we will update the information in logicalrep_partition_open to save
+ * unnecessary work.
+ */
+void
+logicalrep_partmap_invalidate(LogicalRepRelation *remoterel)

/to save/to avoid
Also, I agree with Amit L. that it is confusing to have
logicalrep_partmap_invalidate() right next to
logicalrep_partmap_invalidate_cb() and both have somewhat different kinds of
logic. So, we can either name it as
logicalrep_partmap_reset_relmap() or logicalrep_partmap_update() unless you
have any other better suggestions? Accordingly, change the comment atop
this function.
Thanks for the comments.
I have separated out the bug-fix for the subscriber-side,
and fixed the typo and function name.
Attached the new version patch set.
Best regards,
Hou zj
Attachments:
v4-0001-Fix-partition-map-cache-invalidation-on-subscriber.patch
v4-0002-Reset-partition-map-cache-when-receiving-new-relatio.patch
v4-0003-Check-partition-table-replica-identity-on-subscriber.patch
v4-0004-fix-memory-leak-about-attrmap.patch
On Sat, Jun 11, 2022 at 10:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com> wrote:
+logicalrep_partmap_invalidate
I wonder why not call this logicalrep_partmap_update() to go with
logicalrep_relmap_update()? It seems confusing to have
logicalrep_partmap_invalidate() right next to
logicalrep_partmap_invalidate_cb().

I am thinking about why we need to update the relmap in this new
function logicalrep_partmap_invalidate()? I think it may be better to
do it in logicalrep_partition_open() when actually required,
otherwise, we end up doing a lot of work that may not be of use unless
the corresponding partition is accessed. Also, it seems awkward to me
that we do the same thing in this new function
logicalrep_partmap_invalidate() and then also in
logicalrep_partition_open() under different conditions.
Both logicalrep_rel_open() and logicalrep_partition_open() only ever
touch the local Relation, never the LogicalRepRelation. Updating the
latter is the responsibility of logicalrep_relmap_update(), which is
there to support handling of the RELATION message by
apply_handle_relation(). Given that we make a separate copy of the
parent's LogicalRepRelMapEntry for each partition to put into the
corresponding LogicalRepPartMapEntry, those copies must be updated as
well when a RELATION message targeting the parent's entry arrives. So
it seems fine that the patch is making it the
logicalrep_relmap_update()'s responsibility to update the partition
copies using the new logicalrep_partition_invalidate/update()
subroutine.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On Mon, Jun 13, 2022 at 2:20 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Sat, Jun 11, 2022 at 10:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com> wrote:
+logicalrep_partmap_invalidate
I wonder why not call this logicalrep_partmap_update() to go with
logicalrep_relmap_update()? It seems confusing to have
logicalrep_partmap_invalidate() right next to
logicalrep_partmap_invalidate_cb().

I am thinking about why we need to update the relmap in this new
function logicalrep_partmap_invalidate()? I think it may be better to
do it in logicalrep_partition_open() when actually required,
otherwise, we end up doing a lot of work that may not be of use unless
the corresponding partition is accessed. Also, it seems awkward to me
that we do the same thing in this new function
logicalrep_partmap_invalidate() and then also in
logicalrep_partition_open() under different conditions.

Both logicalrep_rel_open() and logicalrep_partition_open() only ever
touch the local Relation, never the LogicalRepRelation.
We do make the copy of remote rel in logicalrep_partition_open() when
the entry is not found. I feel the same should happen when remote
relation is reset/invalidated by the RELATION message.
Updating the
latter is the responsibility of logicalrep_relmap_update(), which is
there to support handling of the RELATION message by
apply_handle_relation(). Given that we make a separate copy of the
parent's LogicalRepRelMapEntry for each partition to put into the
corresponding LogicalRepPartMapEntry, those copies must be updated as
well when a RELATION message targeting the parent's entry arrives. So
it seems fine that the patch is making it the
logicalrep_relmap_update()'s responsibility to update the partition
copies using the new logicalrep_partition_invalidate/update()
subroutine.
I think we can do that way as well but do you see any benefit in it?
The way I am suggesting will avoid the effort of updating the remote
rel copy till we try to access that particular partition.
--
With Regards,
Amit Kapila.
On Mon, Jun 13, 2022 at 1:03 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Monday, June 13, 2022 1:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I have separated out the bug-fix for the subscriber-side.
And fix the typo and function name.
Attach the new version patch set.
The first patch looks good to me. I have slightly modified one of the
comments and the commit message. I think we need to backpatch this
through 13 where we introduced support to replicate into partitioned
tables (commit f1ac27bf). If you guys are fine, I'll push this once
the work for PG14.4 is done.
--
With Regards,
Amit Kapila.
Attachments:
v5-0001-Fix-cache-look-up-failures-while-applying-changes.patch
On Mon, Jun 13, 2022 at 9:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jun 13, 2022 at 1:03 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Monday, June 13, 2022 1:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I have separated out the bug-fix for the subscriber-side.
And fix the typo and function name.
Attach the new version patch set.

The first patch looks good to me. I have slightly modified one of the
comments and the commit message. I think we need to backpatch this
through 13 where we introduced support to replicate into partitioned
tables (commit f1ac27bf). If you guys are fine, I'll push this once
the work for PG14.4 is done.
Both the code changes and test cases look good to me. Just a couple
of minor nitpicks with test changes:
+ CREATE UNIQUE INDEX tab5_a_idx ON tab5 (a);
+ ALTER TABLE tab5 REPLICA IDENTITY USING INDEX tab5_a_idx;
+ ALTER TABLE tab5_1 REPLICA IDENTITY USING INDEX tab5_1_a_idx;});
Not sure if we follow it elsewhere, but should we maybe avoid using
the internally generated index name as in the partition's case above?
+# Test the case that target table on subscriber is a partitioned table and
+# check that the changes are replicated correctly after changing the schema of
+# table on subcriber.
The first sentence makes it sound like the tests that follow are the
first ones in the file where the target table is partitioned, which is
not true, so I think we should drop that part. Also how about being
more specific about the test intent, say:
Test that replication continues to work correctly after altering the
partition of a partitioned target table.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On Mon, Jun 13, 2022 at 6:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jun 13, 2022 at 2:20 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Sat, Jun 11, 2022 at 10:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com> wrote:
+logicalrep_partmap_invalidate
I wonder why not call this logicalrep_partmap_update() to go with
logicalrep_relmap_update()? It seems confusing to have
logicalrep_partmap_invalidate() right next to
logicalrep_partmap_invalidate_cb().

I am thinking about why we need to update the relmap in this new
function logicalrep_partmap_invalidate()? I think it may be better to
do it in logicalrep_partition_open() when actually required,
otherwise, we end up doing a lot of work that may not be of use unless
the corresponding partition is accessed. Also, it seems awkward to me
that we do the same thing in this new function
logicalrep_partmap_invalidate() and then also in
logicalrep_partition_open() under different conditions.

Both logicalrep_rel_open() and logicalrep_partition_open() only ever
touch the local Relation, never the LogicalRepRelation.

We do make the copy of remote rel in logicalrep_partition_open() when
the entry is not found. I feel the same should happen when remote
relation is reset/invalidated by the RELATION message.
Hmm, the problem is that a RELATION message will only invalidate the
LogicalRepRelation portion of the target parent's entry, while any
copies that have been made for partitions that were opened till that
point will continue to have the old LogicalRepRelation information.
As things stand, logicalrep_partition_open() won't know that the
parent entry's LogicalRepRelation may have been modified due to a
RELATION message. It will reconstruct the entry only if the partition
itself was modified locally, that is, if
logicalrep_partmap_invalidate_cb() was called on the partition.
Updating the
latter is the responsibility of logicalrep_relmap_update(), which is
there to support handling of the RELATION message by
apply_handle_relation(). Given that we make a separate copy of the
parent's LogicalRepRelMapEntry for each partition to put into the
corresponding LogicalRepPartMapEntry, those copies must be updated as
well when a RELATION message targeting the parent's entry arrives. So
it seems fine that the patch is making it the
logicalrep_relmap_update()'s responsibility to update the partition
copies using the new logicalrep_partition_invalidate/update()
subroutine.

I think we can do that way as well but do you see any benefit in it?
The way I am suggesting will avoid the effort of updating the remote
rel copy till we try to access that particular partition.
I don't see any benefit as such to doing it the way the patch does,
it's just that that seems to be the only way to go given the way
things are.
This would have been unnecessary, for example, if the relation map
entry had contained a LogicalRepRelation pointer instead of the
struct. The partition entries would point to the same entry as the
parent's if that were the case and there would be no need to modify
the partitions' copies explicitly.
Am I missing something?
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On Tue, Jun 14, 2022 at 3:31 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Mon, Jun 13, 2022 at 6:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think we can do that way as well but do you see any benefit in it?
The way I am suggesting will avoid the effort of updating the remote
rel copy till we try to access that particular partition.

I don't see any benefit as such to doing it the way the patch does,
it's just that that seems to be the only way to go given the way
things are.
Oh, I see that v4-0002 has this:
+/*
+ * Reset the entries in the partition map that refer to remoterel
+ *
+ * Called when new relation mapping is sent by the publisher to update our
+ * expected view of incoming data from said publisher.
+ *
+ * Note that we don't update the remoterel information in the entry here,
+ * we will update the information in logicalrep_partition_open to avoid
+ * unnecessary work.
+ */
+void
+logicalrep_partmap_reset_relmap(LogicalRepRelation *remoterel)
+{
+ HASH_SEQ_STATUS status;
+ LogicalRepPartMapEntry *part_entry;
+ LogicalRepRelMapEntry *entry;
+
+ if (LogicalRepPartMap == NULL)
+ return;
+
+ hash_seq_init(&status, LogicalRepPartMap);
+ while ((part_entry = (LogicalRepPartMapEntry *)
hash_seq_search(&status)) != NULL)
+ {
+ entry = &part_entry->relmapentry;
+
+ if (entry->remoterel.remoteid != remoterel->remoteid)
+ continue;
+
+ logicalrep_relmap_free_entry(entry);
+
+ memset(entry, 0, sizeof(LogicalRepRelMapEntry));
+ }
+}
The previous versions would also call logicalrep_relmap_update() on
the entry after the memset, which is no longer done, so that is indeed
saving useless work. I also see that both logicalrep_relmap_update()
and the above function basically invalidate the whole
LogicalRepRelMapEntry before setting the new remote relation info so
that the next logicalrep_rel_open() or logicalrep_partition_open()
have to refill the other members too.
Though, I thought maybe you were saying that we shouldn't need this
function for resetting partitions in the first place, which I guess
you weren't.
v4-0002 looks good btw, except the nitpick about test comment similar
to my earlier comment regarding v5-0001:
+# Change the column order of table on publisher
I think it might be better to say something specific to describe the
test intent, like:
Test that replication into partitioned target table continues to work
correctly when the published table is altered
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On Tue, Jun 14, 2022 2:18 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Mon, Jun 13, 2022 at 9:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jun 13, 2022 at 1:03 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
On Monday, June 13, 2022 1:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I have separated out the bug-fix for the subscriber-side.
And fix the typo and function name.
Attach the new version patch set.

The first patch looks good to me. I have slightly modified one of the
comments and the commit message. I think we need to backpatch this
through 13 where we introduced support to replicate into partitioned
tables (commit f1ac27bf). If you guys are fine, I'll push this once
the work for PG14.4 is done.

Both the code changes and test cases look good to me. Just a couple
of minor nitpicks with test changes:
Thanks for your comments.
+ CREATE UNIQUE INDEX tab5_a_idx ON tab5 (a);
+ ALTER TABLE tab5 REPLICA IDENTITY USING INDEX tab5_a_idx;
+ ALTER TABLE tab5_1 REPLICA IDENTITY USING INDEX tab5_1_a_idx;});

Not sure if we follow it elsewhere, but should we maybe avoid using
the internally generated index name as in the partition's case above?
I saw some existing tests also use internally generated index name (e.g.
replica_identity.sql, ddl.sql and 031_column_list.pl), so maybe it's better to
fix them all in a separate patch. I didn't change this.
+# Test the case that target table on subscriber is a partitioned table and
+# check that the changes are replicated correctly after changing the schema of
+# table on subcriber.

The first sentence makes it sound like the tests that follow are the
first ones in the file where the target table is partitioned, which is
not true, so I think we should drop that part. Also how about being
more specific about the test intent, say:

Test that replication continues to work correctly after altering the
partition of a partitioned target table.
OK, modified.
Attached the new version of patch set, and the patches for pg14 and pg13.
Regards,
Shi yu
Attachments:
v6-0001-Fix-cache-look-up-failures-while-applying-changes.patch
v6-0002-Reset-partition-map-cache-when-receiving-new-rela.patch
v6-0003-Check-partition-table-replica-identity-on-subscri.patch
v6-0004-fix-memory-leak-about-attrmap.patch
v6-pg13-0001-Fix-cache-look-up-failures-while-applying-chang_patch
v6-pg14-0001-Fix-cache-look-up-failures-while-applying-chang_patch
On Tue, Jun 14, 2022 at 1:02 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Tue, Jun 14, 2022 at 3:31 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Mon, Jun 13, 2022 at 6:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think we can do that way as well but do you see any benefit in it?
The way I am suggesting will avoid the effort of updating the remote
rel copy till we try to access that particular partition.

I don't see any benefit as such to doing it the way the patch does,
it's just that that seems to be the only way to go given the way
things are.

Oh, I see that v4-0002 has this:

[the logicalrep_partmap_reset_relmap() hunk quoted in full upthread]

The previous versions would also call logicalrep_relmap_update() on
the entry after the memset, which is no longer done, so that is indeed
saving useless work. I also see that both logicalrep_relmap_update()
and the above function basically invalidate the whole
LogicalRepRelMapEntry before setting the new remote relation info so
that the next logicalrep_rel_open() or logicalrep_partition_open()
have to refill the other members too.

Though, I thought maybe you were saying that we shouldn't need this
function for resetting partitions in the first place, which I guess
you weren't.
Right.
v4-0002 looks good btw, except the nitpick about test comment similar
to my earlier comment regarding v5-0001:

+# Change the column order of table on publisher

I think it might be better to say something specific to describe the
test intent, like:

Test that replication into partitioned target table continues to work
correctly when the published table is altered
Okay, changed this and slightly modified the comments and commit message.
I am just attaching the HEAD patches for the first two issues.
--
With Regards,
Amit Kapila.
Attachments:
v7-0001-Fix-cache-look-up-failures-while-applying-changes.patch
v7-0002-Fix-data-inconsistency-between-publisher-and-subs.patch
On Tue, Jun 14, 2022 at 9:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jun 14, 2022 at 1:02 PM Amit Langote <amitlangote09@gmail.com> wrote:
+# Change the column order of table on publisher
I think it might be better to say something specific to describe the
test intent, like:

Test that replication into partitioned target table continues to work
correctly when the published table is altered

Okay, changed this and slightly modified the comments and commit message.
I am just attaching the HEAD patches for the first two issues.
LGTM, thanks.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On Tue, Jun 14, 2022 8:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
v4-0002 looks good btw, except the nitpick about test comment similar
to my earlier comment regarding v5-0001:

+# Change the column order of table on publisher

I think it might be better to say something specific to describe the
test intent, like:

Test that replication into partitioned target table continues to work
correctly when the published table is altered

Okay, changed this and slightly modified the comments and commit message.
I am just attaching the HEAD patches for the first two issues.
Thanks for updating the patch.
Attached the new patch set which ran pgindent, and the patches for pg14 and
pg13. (Only the first two patches of the patch set.)
Regards,
Shi yu
Attachments:
v8-pg14-0002-Fix-data-inconsistency-between-publisher-and-su_patch
v8-pg14-0001-Fix-cache-look-up-failures-while-applying-chang_patch
v8-pg13-0002-Fix-data-inconsistency-between-publisher-and-su_patch
v8-pg13-0001-Fix-cache-look-up-failures-while-applying-chang_patch
v8-0002-Fix-data-inconsistency-between-publisher-and-subs.patch
v8-0001-Fix-cache-look-up-failures-while-applying-changes.patch
On Wed, Jun 15, 2022 at 8:52 AM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
On Tue, Jun 14, 2022 8:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Okay, changed this and slightly modified the comments and commit message.
I am just attaching the HEAD patches for the first two issues.

Thanks for updating the patch.
Attached the new patch set which ran pgindent, and the patches for pg14 and
pg13. (Only the first two patches of the patch set.)
I have pushed the first bug-fix patch today.
--
With Regards,
Amit Kapila.