Restore replication settings when modifying a field type

Started by Quan Zongliangover 6 years ago11 messageshackers
Jump to latest
#1Quan Zongliang
quanzongliang@gmail.com

When the user modifies the REPLICA IDENTIFY field type, the logical
replication settings are lost.

For example:

postgres=# \d+ t1
Table "public.t1"
Column | Type | Collation | Nullable | Default | Storage | Stats
target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
col1 | integer | | | | plain |
|
col2 | integer | | not null | | plain |
|
Indexes:
"t1_col2_key" UNIQUE CONSTRAINT, btree (col2) REPLICA IDENTITY

postgres=# alter table t1 alter col2 type smallint;
ALTER TABLE
postgres=# \d+ t1
Table "public.t1"
Column | Type | Collation | Nullable | Default | Storage | Stats
target | Description
--------+----------+-----------+----------+---------+---------+--------------+-------------
col1 | integer | | | | plain |
|
col2 | smallint | | not null | | plain |
|
Indexes:
"t1_col2_key" UNIQUE CONSTRAINT, btree (col2)

In fact, the replication property of the table has not been modified,
and it is still 'i'(REPLICA_IDENTITY_INDEX). But the previously
specified index property 'indisreplident' is set to false because of the
rebuild.

So I developed a patch. If the user modifies the field type. The
associated index is REPLICA IDENTITY. Rebuild and restore replication
settings.

Regards,
Quan Zongliang

Attachments:

replidfieldtype.difftext/plain; charset=UTF-8; name=replidfieldtype.diff; x-mac-creator=0; x-mac-type=0Download+90-2
#2Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Quan Zongliang (#1)
Re: Restore replication settings when modifying a field type

Hello.

# The patch no longer applies on the current master. Needs a rebasing.

At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang <quanzongliang@gmail.com> wrote in

In fact, the replication property of the table has not been modified,
and it is still 'i'(REPLICA_IDENTITY_INDEX). But the previously
specified index property 'indisreplident' is set to false because of
the rebuild.

I suppose that the behavior is intended. Change of column types on the
publisher side can break the agreement on replica identity with
subscribers. Thus replica identity setting cannot be restored
unconditionally. For (somewhat artifitial :p) example:

P=# create table t (c1 integer, c2 text unique not null);
P=# alter table t replica identity using index t_c2_key;
P=# create publication p1 for table t;
P=# insert into t values (0, '00'), (1, '01');
S=# create table t (c1 integer, c2 text unique not null);
S=# alter table t replica identity using index t_c2_key;
S=# create subscription s1 connection '...' publication p1;

Your patch allows change of the type of c2 into integer.

P=# alter table t alter column c2 type integer using c2::integer;
P=# update t set c1 = c1 + 1 where c2 = '01';

This change doesn't affect perhaps as expected.

S=# select * from t;
c1 | c2
----+----
0 | 00
1 | 01
(2 rows)

So I developed a patch. If the user modifies the field type. The
associated index is REPLICA IDENTITY. Rebuild and restore replication
settings.

Explicit setting of replica identity premises that they are sure that
the setting works correctly. Implicit rebuilding after a type change
can silently break it.

At least we need to guarantee that the restored replica identity
setting is truly compatible with all existing subscribers. I'm not
sure about potential subscribers..

Anyway I think it is a problem that replica identity setting is
dropped silently. Perhaps a message something like "REPLICA IDENTITY
setting is lost, please redefine after confirmation of compatibility
with subscribers." is needed.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#3Quan Zongliang
quanzongliang@gmail.com
In reply to: Kyotaro Horiguchi (#2)
Re: Restore replication settings when modifying a field type

On 2019/10/28 12:39, Kyotaro Horiguchi wrote:

Hello.

# The patch no longer applies on the current master. Needs a rebasing.

At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang <quanzongliang@gmail.com> wrote in

In fact, the replication property of the table has not been modified,
and it is still 'i'(REPLICA_IDENTITY_INDEX). But the previously
specified index property 'indisreplident' is set to false because of
the rebuild.

I suppose that the behavior is intended. Change of column types on the
publisher side can break the agreement on replica identity with
subscribers. Thus replica identity setting cannot be restored
unconditionally. For (somewhat artifitial :p) example:

P=# create table t (c1 integer, c2 text unique not null);
P=# alter table t replica identity using index t_c2_key;
P=# create publication p1 for table t;
P=# insert into t values (0, '00'), (1, '01');
S=# create table t (c1 integer, c2 text unique not null);
S=# alter table t replica identity using index t_c2_key;
S=# create subscription s1 connection '...' publication p1;

Your patch allows change of the type of c2 into integer.

P=# alter table t alter column c2 type integer using c2::integer;
P=# update t set c1 = c1 + 1 where c2 = '01';

This change doesn't affect perhaps as expected.

S=# select * from t;
c1 | c2
----+----
0 | 00
1 | 01
(2 rows)

So I developed a patch. If the user modifies the field type. The
associated index is REPLICA IDENTITY. Rebuild and restore replication
settings.

Explicit setting of replica identity premises that they are sure that
the setting works correctly. Implicit rebuilding after a type change
can silently break it.

At least we need to guarantee that the restored replica identity
setting is truly compatible with all existing subscribers. I'm not
sure about potential subscribers..

Anyway I think it is a problem that replica identity setting is
dropped silently. Perhaps a message something like "REPLICA IDENTITY
setting is lost, please redefine after confirmation of compatibility
with subscribers." is needed.

In fact, the scene we encountered is like this. The field of a user's
table is of type "smallint", and it turns out that this range is not
sufficient. So change it to "int". At this point, the REPLICA IDENTITY
is lost and the user does not realize it. When they found out, the
logical replication for this period of time did not output normally.
Users have to find other ways to get the data back.
The logical replication of this user is to issue standard SQL statements
to other relational databases using the plugin developed by himself. And
they have thousands of tables to replicate.
So I think this patch is appropriate in this scenario. As for the
matching problem between publishers and subscribers, I'm afraid it's
hard to solve here. If this is not a suitable modification, I can
withdraw it. And see if there's a better way.

If necessary, I'll modify it again. Rebase to the master branch.

Show quoted text

regards.

In reply to: Kyotaro Horiguchi (#2)
Re: Restore replication settings when modifying a field type

Em seg, 28 de out de 2019 às 01:41, Kyotaro Horiguchi
<horikyota.ntt@gmail.com> escreveu:

At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang <quanzongliang@gmail.com> wrote in

In fact, the replication property of the table has not been modified,
and it is still 'i'(REPLICA_IDENTITY_INDEX). But the previously
specified index property 'indisreplident' is set to false because of
the rebuild.

I suppose that the behavior is intended. Change of column types on the
publisher side can break the agreement on replica identity with
subscribers. Thus replica identity setting cannot be restored
unconditionally. For (somewhat artifitial :p) example:

I don't think so. The actual logical replication behavior is that DDL
will always break replication. If you add a new column or drop a
column, you will stop replication for that table while you don't
execute the same DDL in the subscriber. What happens in the OP case is
that a DDL is *silently* breaking the logical replication. IMHO it is
a bug. If the behavior was intended it should clean
pg_class.relreplident but it is not.

Explicit setting of replica identity premises that they are sure that
the setting works correctly. Implicit rebuilding after a type change
can silently break it.

The current behavior forces the OP to execute 2 DDLs in the same
transaction to ensure that it won't "loose" transactions for logical
replication.

At least we need to guarantee that the restored replica identity
setting is truly compatible with all existing subscribers. I'm not
sure about potential subscribers..

Why? Replication will break and to fix it you should apply the same
DDL you apply in publisher. It is the right thing to do.

[poking the code...]

ATExecAlterColumnType records everything that depends on the column
and for indexes it saves the definition (via pg_get_indexdef_string).
Definition is not sufficient for reconstructing the replica identity
information because there is not such keyword for replica identity in
CREATE INDEX. The new index should call relation_mark_replica_identity
to fix pg_index.indisreplident.

--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

#5Quan Zongliang
quanzongliang@gmail.com
In reply to: Kyotaro Horiguchi (#2)
Re: Restore replication settings when modifying a field type

On 2019/10/28 12:39, Kyotaro Horiguchi wrote:

Hello.

# The patch no longer applies on the current master. Needs a rebasing.

New patch, rebased on master branch.

Show quoted text

At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang <quanzongliang@gmail.com> wrote in

In fact, the replication property of the table has not been modified,
and it is still 'i'(REPLICA_IDENTITY_INDEX). But the previously
specified index property 'indisreplident' is set to false because of
the rebuild.

I suppose that the behavior is intended. Change of column types on the
publisher side can break the agreement on replica identity with
subscribers. Thus replica identity setting cannot be restored
unconditionally. For (somewhat artifitial :p) example:

P=# create table t (c1 integer, c2 text unique not null);
P=# alter table t replica identity using index t_c2_key;
P=# create publication p1 for table t;
P=# insert into t values (0, '00'), (1, '01');
S=# create table t (c1 integer, c2 text unique not null);
S=# alter table t replica identity using index t_c2_key;
S=# create subscription s1 connection '...' publication p1;

Your patch allows change of the type of c2 into integer.

P=# alter table t alter column c2 type integer using c2::integer;
P=# update t set c1 = c1 + 1 where c2 = '01';

This change doesn't affect perhaps as expected.

S=# select * from t;
c1 | c2
----+----
0 | 00
1 | 01
(2 rows)

So I developed a patch. If the user modifies the field type. The
associated index is REPLICA IDENTITY. Rebuild and restore replication
settings.

Explicit setting of replica identity premises that they are sure that
the setting works correctly. Implicit rebuilding after a type change
can silently break it.

At least we need to guarantee that the restored replica identity
setting is truly compatible with all existing subscribers. I'm not
sure about potential subscribers..

Anyway I think it is a problem that replica identity setting is
dropped silently. Perhaps a message something like "REPLICA IDENTITY
setting is lost, please redefine after confirmation of compatibility
with subscribers." is needed.

regards.

Attachments:

replidfieldtype.patchtext/plain; charset=UTF-8; name=replidfieldtype.patch; x-mac-creator=0; x-mac-type=0Download+102-3
#6Peter Eisentraut
peter_e@gmx.net
In reply to: Euler Taveira de Oliveira (#4)
Re: Restore replication settings when modifying a field type

On 2019-11-01 04:39, Euler Taveira wrote:

ATExecAlterColumnType records everything that depends on the column
and for indexes it saves the definition (via pg_get_indexdef_string).
Definition is not sufficient for reconstructing the replica identity
information because there is not such keyword for replica identity in
CREATE INDEX. The new index should call relation_mark_replica_identity
to fix pg_index.indisreplident.

Yeah, I don't think we need to do the full dance of reverse compiling
the SQL command and reexecuting it, as the patch currently does. That's
only necessary for rebuilding the index itself. For re-setting the
replica identity, we can just use the internal API as you say.

Also, a few test cases would be nice for this patch.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7Quan Zongliang
quanzongliang@foxmail.com
In reply to: Peter Eisentraut (#6)
Re: Restore replication settings when modifying a field type

On 2020/1/3 17:14, Peter Eisentraut wrote:

On 2019-11-01 04:39, Euler Taveira wrote:

ATExecAlterColumnType records everything that depends on the column
and for indexes it saves the definition (via pg_get_indexdef_string).
Definition is not sufficient for reconstructing the replica identity
information because there is not such keyword for replica identity in
CREATE INDEX. The new index should call relation_mark_replica_identity
to fix pg_index.indisreplident.

Yeah, I don't think we need to do the full dance of reverse compiling
the SQL command and reexecuting it, as the patch currently does.  That's
only necessary for rebuilding the index itself.  For re-setting the
replica identity, we can just use the internal API as you say.

Also, a few test cases would be nice for this patch.

I'm a little busy. I'll write a new patch in a few days.

#8Quan Zongliang
quanzongliang@foxmail.com
In reply to: Quan Zongliang (#7)
Re: Restore replication settings when modifying a field type

On 2020/1/15 08:30, Quan Zongliang wrote:

On 2020/1/3 17:14, Peter Eisentraut wrote:

On 2019-11-01 04:39, Euler Taveira wrote:

ATExecAlterColumnType records everything that depends on the column
and for indexes it saves the definition (via pg_get_indexdef_string).
Definition is not sufficient for reconstructing the replica identity
information because there is not such keyword for replica identity in
CREATE INDEX. The new index should call relation_mark_replica_identity
to fix pg_index.indisreplident.

Yeah, I don't think we need to do the full dance of reverse compiling
the SQL command and reexecuting it, as the patch currently does.
That's only necessary for rebuilding the index itself.  For re-setting
the replica identity, we can just use the internal API as you say.

Also, a few test cases would be nice for this patch.

I'm a little busy. I'll write a new patch in a few days.

new patch attached.

test case 1:
create table t1 (col1 int,col2 int not null,
col3 int not null,unique(col2,col3));
alter table t1 replica identity using index t1_col2_col3_key;
alter table t1 alter col3 type text;
alter table t1 alter col3 type smallint using col3::int;

test case 2:
create table t2 (col1 varchar(10), col2 text not null,
col3 timestamp not null,unique(col2,col3),
col4 int not null unique);
alter table t2 replica identity using index t2_col2_col3_key;
alter table t2 alter col3 type text;
alter table t2 replica identity using index t2_col4_key;
alter table t2 alter col4 type timestamp using '2020-02-11'::timestamp;

Attachments:

replfieldtype.patchtext/plain; charset=UTF-8; name=replfieldtype.patch; x-mac-creator=0; x-mac-type=0Download+75-4
#9Peter Eisentraut
peter_e@gmx.net
In reply to: Quan Zongliang (#8)
Re: Restore replication settings when modifying a field type

On 2020-02-11 00:38, Quan Zongliang wrote:

new patch attached.

I didn't like so much how the updating of the replica identity was
hacked into the middle of ATRewriteCatalogs(). I have an alternative
proposal in the attached patch that queues up an ALTER TABLE ... REPLICA
IDENTITY command into the normal ALTER TABLE processing. I have also
added tests to the test suite.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v4-0001-Preserve-replica-identity-index-across-ALTER-TABL.patchtext/plain; charset=UTF-8; name=v4-0001-Preserve-replica-identity-index-across-ALTER-TABL.patch; x-mac-creator=0; x-mac-type=0Download+133-1
#10Euler Taveira
euler.taveira@2ndquadrant.com
In reply to: Peter Eisentraut (#9)
Re: Restore replication settings when modifying a field type

On Thu, 5 Mar 2020 at 09:45, Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> wrote:

On 2020-02-11 00:38, Quan Zongliang wrote:

new patch attached.

I didn't like so much how the updating of the replica identity was
hacked into the middle of ATRewriteCatalogs(). I have an alternative
proposal in the attached patch that queues up an ALTER TABLE ... REPLICA
IDENTITY command into the normal ALTER TABLE processing. I have also
added tests to the test suite.

LGTM. Tests are ok. I've rebased it (because

61d7c7bce3686ec02bd64abac742dd35ed9b9b01). Are you planning to backpatch
it? IMHO you should because it is a bug (since REPLICA IDENTITY was
introduced in 9.4). This patch can be applied as-is in 12 but not to other
older branches. I attached new patches.

Regards,

--
Euler Taveira http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v5-11-0001-Preserve-replica-identity-index-across-ALTER-TABLE-r.patchapplication/x-patch; name=v5-11-0001-Preserve-replica-identity-index-across-ALTER-TABLE-r.patchDownload+133-1
v5-12-0001-Preserve-replica-identity-index-across-ALTER-TABLE-r.patchapplication/x-patch; name=v5-12-0001-Preserve-replica-identity-index-across-ALTER-TABLE-r.patchDownload+133-1
v5-96-0001-Preserve-replica-identity-index-across-ALTER-TABLE-r.patchapplication/x-patch; name=v5-96-0001-Preserve-replica-identity-index-across-ALTER-TABLE-r.patchDownload+137-1
v5-95-0001-Preserve-replica-identity-index-across-ALTER-TABLE-r.patchapplication/x-patch; name=v5-95-0001-Preserve-replica-identity-index-across-ALTER-TABLE-r.patchDownload+137-1
v5-10-0001-Preserve-replica-identity-index-across-ALTER-TABLE-r.patchapplication/x-patch; name=v5-10-0001-Preserve-replica-identity-index-across-ALTER-TABLE-r.patchDownload+137-1
v5-master-0001-Preserve-replica-identity-index-across-ALTER-TABLE-r.patchapplication/x-patch; name=v5-master-0001-Preserve-replica-identity-index-across-ALTER-TABLE-r.patchDownload+133-1
#11Peter Eisentraut
peter_e@gmx.net
In reply to: Euler Taveira (#10)
Re: Restore replication settings when modifying a field type

On 2020-03-10 14:16, Euler Taveira wrote:

On Thu, 5 Mar 2020 at 09:45, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com
<mailto:peter.eisentraut@2ndquadrant.com>> wrote:

On 2020-02-11 00:38, Quan Zongliang wrote:

new patch attached.

I didn't like so much how the updating of the replica identity was
hacked into the middle of ATRewriteCatalogs().  I have an alternative
proposal in the attached patch that queues up an ALTER TABLE ...
REPLICA
IDENTITY command into the normal ALTER TABLE processing.  I have also
added tests to the test suite.

LGTM. Tests are ok. I've rebased it (because
61d7c7bce3686ec02bd64abac742dd35ed9b9b01). Are you planning to backpatch
it? IMHO you should because it is a bug (since REPLICA IDENTITY was
introduced in 9.4). This patch can be applied as-is in 12 but not to
other older branches. I attached new patches.

Thanks. This has been committed and backpatched to 9.5.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services