Support logical replication of DDLs
Hello,
One of the most frequently requested improvements from our customers
is to reduce downtime associated with software updates (both major and
minor versions). To do this, we have reviewed potential contributions to
improving logical replication.
I’m working on a patch to support logical replication of data
definition language statements (DDLs). This is a useful feature when a
database in logical replication has lots of tables, functions and
other objects that change over time, such as in online cross major
version upgrade.
I put together a prototype that replicates DDLs using the generic
messages for logical decoding. The idea is to log the candidate DDL
string in ProcessUtilitySlow() using LogLogicalMessge() with a new
flag in WAL record type xl_logical_message indicating it’s a DDL
message. The xl_logical_message record is decoded and sent to the
subscriber via pgoutput. The logical replication worker process is
dispatched for this new DDL message type and executes the command
accordingly.
However, there are still many edge cases to sort out because not every
DDL statement can/should be replicated. Some of these include:
1. DDL involving multiple tables where only some tables are replicated, e.g.
DROP TABLE replicated_foo, notreplicated_bar;
This statement will fail on the subscriber and block logical
replication. It can be detected and filtered on the publisher.
2. Any DDL that calls a volatile function, such as NOW() or RAND(), is
likely to generate a different value on each replica. It is possible
to work around these issues—for example, the publisher can replace any
volatile function calls with a fixed return value when the statement
is logged so that the subscribers all get the same value. We will have
to consider some other cases.
3. CREATE TABLE AS and SELECT INTO, For example:
CREATE TABLE foo AS
SELECT field_1, field_2 FROM bar;
There are a few issues that can occur here. For one, it’s possible
that table bar doesn't exist on the subscriber. Even if “bar” does
exist, it may not be fully up-to-date with the publisher, which would
cause a data mismatch on “foo” between the publisher and subscriber.
4. Statements that have nondeterministic side effects (e.g., as caused
by triggers, stored procedures, user-defined functions) may result in
different side effects occurring on each subscriber.
Whether a DDL should be replicated also depends on what granularity do
we define DDL replication. For example, we can define DDL replication
on these levels:
1. Database level
Allows all DDLs for a database to be replicated except for certain
edge cases (refer to the edge cases mentioned above).
This is likely a major use case, such as in online major version upgrade.
2. Table level
Allows DDLs on the published tables to be replicated except for
certain edge cases.
This is useful for consolidating multiple databases into a single one,
e.g. for analytics.
3. Other fine-grained levels base on the object type such as index,
function, procedure and view etc.
Allows DDLs on certain object types to be replicated. At the moment
I’m unsure of a use case for this.
To implement such DDL replication levels, we need to modify the CREATE
PUBLICATION syntax. For example, to help starting the discussion on
the granularity of DDL replication, we can add a new option list ‘ddl’
in the WITH definition:
CREATE PUBLICATION mypub FOR ALL TABLES with
(publish = ‘insert, update, delete, truncate’, ddl = ‘database’)
CREATE PUBLICATION mypub FOR TABLE T1, T2 with
(publish = ‘insert, update, delete, truncate’, ddl = ‘table’)
CREATE PUBLICATION mypub FOR TABLE T1, T2 with
(publish = ‘insert, update, delete, truncate’, ddl = ‘table, function,
procedure’)
We can probably make “ddl = ‘database’” valid only for the FOR ALL
TABLES publication, because it doesn’t really make sense to replicate
all DDLs for a database when only a subset of tables are being
replicated (which can cause edge case 1 to occur frequently). “ddl =
‘database’” + FOR ALL TABLES is likely where logical replication of
DDL is most useful, i.e. for online major versions upgrades.
Based on the DDL publication levels we can further implement the logic
to conditionally log the DDL commands or to conditionally decode/ship
the logical DDL message.
Thoughts? Your feedback is appreciated.
Thanks,
Zheng Li
Amazon RDS/Aurora for PostgreSQL
Hi Zheng,
I’m working on a patch to support logical replication of data
definition language statements (DDLs).
That's great!
However, there are still many edge cases to sort out because not every
DDL statement can/should be replicated.
Maybe the first implementation shouldn't be perfect as long as known
limitations are documented and the future improvements are unlikely to
break anything for the users. Committing an MVP and iterating on this is
much simpler in terms of development and code review. Also, it will deliver
value to the users and give us feedback sooner.
1. DDL involving multiple tables where only some tables are replicated,
e.g.
DROP TABLE replicated_foo, notreplicated_bar;
I would add DROP TABLE ... CASCADE to the list. Also, let's not forget that
PostgreSQL supports table inheritance and table partitioning.
2. Any DDL that calls a volatile function, such as NOW() or RAND(), is
likely to generate a different value on each replica. It is possible
to work around these issues—for example, the publisher can replace any
volatile function calls with a fixed return value when the statement
is logged so that the subscribers all get the same value. We will have
to consider some other cases.
That would make sense.
[...]
Whether a DDL should be replicated also depends on what granularity do
we define DDL replication. For example, we can define DDL replication
on these levels:1. Database level
Allows all DDLs for a database to be replicated except for certain
edge cases (refer to the edge cases mentioned above).
This is likely a major use case, such as in online major version upgrade.
To my knowledge, this is not a primary use case for logical replication.
Also, I suspect that implementing it may be a bit challenging. What if we
focus on table-level replication for now?
--
Best regards,
Aleksander Alekseev
Em seg., 21 de fev. de 2022 às 13:13, Zheng Li <zhengli10@gmail.com>
escreveu:
2. Table level
Allows DDLs on the published tables to be replicated except for
certain edge cases.Think how to handle triggers and functions with same name but different
purpose.
Publisher
create function public.audit() returns trigger language plpgsql as $$
begin
new.Audit_User = current_user();
new.Audit_Date_Time = now();
return new;
end;$$
create trigger audit before insert or update on foo for each row execute
procedure public.audit();
Subscriber
create function public.audit() returns trigger language plpgsql as $$
begin
insert into Audit(Audit_Date_Time, Audit_User, Schema_Name, Table_Name,
Audit_Action, Field_Values) values(new.Audit_ts, new.Audit_User,
tg_table_schema, tg_table_name, tg_op, row_to_json(case when tg_op =
'DELETE' then old.* else new.* end));
return null;
end;$$
create trigger audit after insert or update or delete on foo for each row
execute procedure public.audit();
regards,
Marcos
Hi Aleksander,
That's great!
Thanks!
Maybe the first implementation shouldn't be perfect as long as known
limitations are documented and the future improvements are unlikely to
break anything for the users. Committing an MVP and iterating on this is
much simpler in terms of development and code review. Also, it will deliver
value to the users and give us feedback sooner.
Agree, I’ll post the prototype when it’s in a bit better shape. Also
I’d like to hear
some more feedback from the community on whether I’m heading the right
direction.
I would add DROP TABLE ... CASCADE to the list. Also, let's not forget that
PostgreSQL supports table inheritance and table partitioning.
Thanks, I will keep this in mind.
1. Database level
Allows all DDLs for a database to be replicated except for certain
edge cases (refer to the edge cases mentioned above).
This is likely a major use case, such as in online major version upgrade.To my knowledge, this is not a primary use case for logical replication.
Also, I suspect that implementing it may be a bit challenging. What if we
focus on table-level replication for now?
I think it is due to the fact that the current limitations in logical
replication are
holding it back in major version upgrade (MVU). Online / reduced downtime MVU
is a popular request from customers, and why we should enhance logical
replication
to support this use case.
Also I think table-level DDL replication is actually more challenging,
especially in
the FOR TABLE case, due to the fact that differences are expected to
occur between the
source and target database. Marcos’ comment also justifies the complexity
in this case. Whereas database-level DDL replication in the FOR ALL
TABLE case is
relatively simple because the source and target database are (almost) identical.
Regards,
Zheng
Hi Zheng,
Also, I suspect that implementing it may be a bit challenging. What if we
focus on table-level replication for now?I think it is due to the fact that the current limitations in logical
replication are
holding it back in major version upgrade (MVU). Online / reduced downtime MVU
is a popular request from customers, and why we should enhance logical
replication
to support this use case.Also I think table-level DDL replication is actually more challenging,
especially in
the FOR TABLE case, due to the fact that differences are expected to
occur between the
source and target database. Marcos’ comment also justifies the complexity
in this case. Whereas database-level DDL replication in the FOR ALL
TABLE case is
relatively simple because the source and target database are (almost) identical.
Sure, I don't insist on implementing table-level replication first.
It's up to you. My point was that it's not necessary to implement
everything at once.
--
Best regards,
Aleksander Alekseev
On Mon, Feb 21, 2022 at 9:43 PM Zheng Li <zhengli10@gmail.com> wrote:
Hello,
One of the most frequently requested improvements from our customers
is to reduce downtime associated with software updates (both major and
minor versions). To do this, we have reviewed potential contributions to
improving logical replication.I’m working on a patch to support logical replication of data
definition language statements (DDLs). This is a useful feature when a
database in logical replication has lots of tables, functions and
other objects that change over time, such as in online cross major
version upgrade.
+1
I put together a prototype that replicates DDLs using the generic
messages for logical decoding. The idea is to log the candidate DDL
string in ProcessUtilitySlow() using LogLogicalMessge() with a new
flag in WAL record type xl_logical_message indicating it’s a DDL
message. The xl_logical_message record is decoded and sent to the
subscriber via pgoutput. The logical replication worker process is
dispatched for this new DDL message type and executes the command
accordingly.
If you don't mind, would you like to share the POC or the branch for this work?
However, there are still many edge cases to sort out because not every
DDL statement can/should be replicated. Some of these include:
3. CREATE TABLE AS and SELECT INTO, For example:
CREATE TABLE foo AS
SELECT field_1, field_2 FROM bar;There are a few issues that can occur here. For one, it’s possible
that table bar doesn't exist on the subscriber. Even if “bar” does
exist, it may not be fully up-to-date with the publisher, which would
cause a data mismatch on “foo” between the publisher and subscriber.
In such cases why don't we just log the table creation WAL for DDL
instead of a complete statement which creates the table and inserts
the tuple? Because we are already WAL logging individual inserts and
once you make sure of replicating the table creation I think the exact
data insertion on the subscriber side will be taken care of by the
insert WALs no?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
+
On Sun, Mar 13, 2022 at 5:05 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Mon, Feb 21, 2022 at 9:43 PM Zheng Li <zhengli10@gmail.com> wrote:
Hello,
One of the most frequently requested improvements from our customers
is to reduce downtime associated with software updates (both major and
minor versions). To do this, we have reviewed potential contributions to
improving logical replication.I’m working on a patch to support logical replication of data
definition language statements (DDLs). This is a useful feature when a
database in logical replication has lots of tables, functions and
other objects that change over time, such as in online cross major
version upgrade.+1
+1
I put together a prototype that replicates DDLs using the generic
messages for logical decoding. The idea is to log the candidate DDL
string in ProcessUtilitySlow() using LogLogicalMessge() with a new
flag in WAL record type xl_logical_message indicating it’s a DDL
message. The xl_logical_message record is decoded and sent to the
subscriber via pgoutput. The logical replication worker process is
dispatched for this new DDL message type and executes the command
accordingly.If you don't mind, would you like to share the POC or the branch for this work?
I would love to try this patch out, would you like to share branch or POC ?
Hi,
If you don't mind, would you like to share the POC or the branch for this work?
The POC patch is attached. It currently supports the following functionalities:
1. Configure either database level or table level DDL replication via
the CREATE PUBLICATION command.
2.Supports replication of DDL of the following types when database
level DDL replication is turned on. Other less common DDL types could
be added later.
T_CreateSchemaStmt
T_CreateStmt
T_CreateForeignTableStmt
T_AlterDomainStmt
T_DefineStmt
T_CompositeTypeStmt
T_CreateEnumStmt
T_CreateRangeStmt
T_AlterEnumStmt
T_ViewStmt
T_CreateFunctionStmt
T_AlterFunctionStmt
T_CreateTrigStmt
T_CreateDomainStmt
T_CreateCastStmt
T_CreateOpClassStmt
T_CreateOpFamilyStmt
T_AlterOpFamilyStmt
T_AlterOperatorStmt
T_AlterTypeStmt
T_GrantStmt
T_AlterCollationStmt
T_AlterTableStmt
T_IndexStmt
3.Supports replication of DDLs of the following types if only table
level DDL replication is turned on.
T_AlterTableStmt
T_IndexStmt
4.Supports seamless DML replication of new tables created via DDL
replication without having to manually running “ALTER SUBSCRIPTION ...
REFRESH PUBLICATION" on the subscriber.
Here is a demo:
source_db=# create publication mypub FOR ALL TABLES with (ddl = ‘database’);
target_db=# create subscription mysub CONNECTION 'dbname=source_db
host=localhost user=myuser port=5432' PUBLICATION mypub;
source_db=#
BEGIN;
CREATE TABLE foo (a int);
CREATE INDEX foo_idx ON foo (a);
ALTER TABLE foo ADD COLUMN b timestamptz;
CREATE FUNCTION foo_ts()
RETURNS trigger AS $$
BEGIN
NEW.b := current_timestamp;
RETURN NEW;
END;
$$
LANGUAGE plpgsql;
CREATE TRIGGER foo_ts BEFORE INSERT OR UPDATE ON foo
FOR EACH ROW EXECUTE FUNCTION foo_ts();
INSERT INTO foo VALUES (1);
COMMIT;
source_db=# select * from foo;
a | b
---+-------------------------------
1 | 2022-03-15 19:07:53.005215+00
(1 row)
target_db=# select * from foo;
a | b
---+-------------------------------
1 | 2022-03-15 19:07:53.005215+00
(1 row)
Here is the remaining work for this patch:
1. Test and handle corner case DDLs, such as the ones listed in my
first email[1]/messages/by-id/CAAD30U+pVmfKwUKy8cbZOnUXyguJ-uBNejwD75Kyo=OjdQGJ9g@mail.gmail.com.
2. Investigate less common DDL types (categorized in function
LogLogicalDDLCommand) and decide if and under what conditions they
should be replicated.
3. Determine if we want to include table-level DDL replication. If so,
we need to continue to enhance and test this feature.
I think this will be ready to consider for commit once 1 and 2 are
resolved. I think 3 can be done in the second phase.
If you are reviewing the code, I’ve provided a detailed description of
the patch below:
1. Syntax, pg_publication and pg_dump change:
Allows the user to configure either database level or table level DDL
replication via the CREATE PUBLICATION command as proposed in my first
email. Two new columns are added to the pg_publication catalog to show
the DDL replication levels, test output publication.out is updated
accordingly. pg_dump is also modified to accommodate the
pg_publication catalog change.
2. Logical logging change
a. A new WAL record type xl_logical_ddl_message is introduced to
support logical logging of DDL command. xl_logical_ddl_message is
similar to the existing xl_logical_message for generic message
logging. The reason for not using xl_logical_message directly as
proposed initially is I found out we need to log more information
(such as user role, search path and potentially more in the future)
than just one string, and we don’t want to make too much changes to
the existing xl_logical_message which may break its current consumers.
b. The logging of DDL command string is processed in function
LogLogicalDDLCommand. We categorize DDL command types into three
categories in this function:
1. replicated in database level replication only (such as CREATE
TABLE, CREATE FUNCTION).
2. replicated in database or table level replication depending on the
configuration (such as ALTER TABLE).
3. not supported for replication or pending investigation.
3. Logical decoding and Reorderbuffer change
Supports logical decoding of the new WAL record
xl_logical_ddl_message. This is similar to the logical decoding of
xl_logical_message. Tests for this change are added in the
test_decoding plugin.
4. Integration with pgoutput
Supports sending and receiving the DDL message using the logical
replication wire protocol. A new LogicalRepMsgType is introduced for
this purpose: LOGICAL_REP_MSG_DDLMESSAGE = 'L'.
5. Logical replication worker change
Supports execution of the DDL command in the original user role and
search_path. For any new table created this way, we also set its
srsubstate in the pg_subscription_rel catalog to SUBREL_STATE_INIT, So
that DML replication can progress on this new table without manually
running "ALTER SUBSCRIPTION ... REFRESH PUBLICATION".
6. TAP test
A new TAP test 030_rep_ddl.pl is added. We mainly focused on testing
the happy path of database level replication so far. Corner case DDLs
and table level DDL replication are still to be carefully tested.
However, there are still many edge cases to sort out because not every
DDL statement can/should be replicated. Some of these include:
3. CREATE TABLE AS and SELECT INTO, For example:
CREATE TABLE foo AS
SELECT field_1, field_2 FROM bar;There are a few issues that can occur here. For one, it’s possible
that table bar doesn't exist on the subscriber. Even if “bar” does
exist, it may not be fully up-to-date with the publisher, which would
cause a data mismatch on “foo” between the publisher and subscriber.
In such cases why don't we just log the table creation WAL for DDL
instead of a complete statement which creates the table and inserts
the tuple? Because we are already WAL logging individual inserts and
once you make sure of replicating the table creation I think the exact
data insertion on the subscriber side will be taken care of by the
insert WALs no?
The table creation WAL and table insert WAL are available. The tricky
part is how do we break down this command into two parts (a normal
CREATE TABLE followed by insertions) either from the parsetree or the
WALs. I’ll have to dig more on this.
I look forward to your feedback.
Regards,
Zheng
[1]: /messages/by-id/CAAD30U+pVmfKwUKy8cbZOnUXyguJ-uBNejwD75Kyo=OjdQGJ9g@mail.gmail.com
Show quoted text
On Sun, Mar 13, 2022 at 7:35 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Mon, Feb 21, 2022 at 9:43 PM Zheng Li <zhengli10@gmail.com> wrote:
Hello,
One of the most frequently requested improvements from our customers
is to reduce downtime associated with software updates (both major and
minor versions). To do this, we have reviewed potential contributions to
improving logical replication.I’m working on a patch to support logical replication of data
definition language statements (DDLs). This is a useful feature when a
database in logical replication has lots of tables, functions and
other objects that change over time, such as in online cross major
version upgrade.+1
I put together a prototype that replicates DDLs using the generic
messages for logical decoding. The idea is to log the candidate DDL
string in ProcessUtilitySlow() using LogLogicalMessge() with a new
flag in WAL record type xl_logical_message indicating it’s a DDL
message. The xl_logical_message record is decoded and sent to the
subscriber via pgoutput. The logical replication worker process is
dispatched for this new DDL message type and executes the command
accordingly.If you don't mind, would you like to share the POC or the branch for this work?
However, there are still many edge cases to sort out because not every
DDL statement can/should be replicated. Some of these include:3. CREATE TABLE AS and SELECT INTO, For example:
CREATE TABLE foo AS
SELECT field_1, field_2 FROM bar;There are a few issues that can occur here. For one, it’s possible
that table bar doesn't exist on the subscriber. Even if “bar” does
exist, it may not be fully up-to-date with the publisher, which would
cause a data mismatch on “foo” between the publisher and subscriber.In such cases why don't we just log the table creation WAL for DDL
instead of a complete statement which creates the table and inserts
the tuple? Because we are already WAL logging individual inserts and
once you make sure of replicating the table creation I think the exact
data insertion on the subscriber side will be taken care of by the
insert WALs no?--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
ddl_replication.patch1application/octet-stream; name=ddl_replication.patch1Download+2047-167
Hello
I think this is a pretty interesting and useful feature.
Did you see some old code I wrote towards this goal?
/messages/by-id/20150215044814.GL3391@alvh.no-ip.org
The intention was that DDL would produce some JSON blob that accurately
describes the DDL that was run; the caller can acquire that and use it
to produce working DDL that doesn't depend on runtime conditions. There
was lots of discussion on doing things this way. It was ultimately
abandoned, but I think it's valuable.
--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
"Porque francamente, si para saber manejarse a uno mismo hubiera que
rendir examen... ¿Quién es el machito que tendría carnet?" (Mafalda)
Hi, Zhang Li
On Thu, 17 Mar 2022 at 05:17, Zheng Li <zhengli10@gmail.com> wrote:
Hi,
If you don't mind, would you like to share the POC or the branch for this work?
The POC patch is attached. It currently supports the following functionalities:
1. Configure either database level or table level DDL replication via
the CREATE PUBLICATION command.2.Supports replication of DDL of the following types when database
level DDL replication is turned on. Other less common DDL types could
be added later.
T_CreateSchemaStmt
T_CreateStmt
T_CreateForeignTableStmt
T_AlterDomainStmt
T_DefineStmt
T_CompositeTypeStmt
T_CreateEnumStmt
T_CreateRangeStmt
T_AlterEnumStmt
T_ViewStmt
T_CreateFunctionStmt
T_AlterFunctionStmt
T_CreateTrigStmt
T_CreateDomainStmt
T_CreateCastStmt
T_CreateOpClassStmt
T_CreateOpFamilyStmt
T_AlterOpFamilyStmt
T_AlterOperatorStmt
T_AlterTypeStmt
T_GrantStmt
T_AlterCollationStmt
T_AlterTableStmt
T_IndexStmt3.Supports replication of DDLs of the following types if only table
level DDL replication is turned on.
T_AlterTableStmt
T_IndexStmt4.Supports seamless DML replication of new tables created via DDL
replication without having to manually running “ALTER SUBSCRIPTION ...
REFRESH PUBLICATION" on the subscriber.
Great! I think we can split the patch into several pieces (at least
implementation and test cases) for easier review.
+ * Simiarl to the generic logical messages, These DDL messages can be either
+ * transactional or non-transactional. Note by default DDLs in PostgreSQL are
+ * transactional.
There is a typo, s/Simiarl/Similar/.
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
On Thu, 17 Mar 2022 at 05:17, Zheng Li <zhengli10@gmail.com> wrote:
Hi,
If you don't mind, would you like to share the POC or the branch for this work?
The POC patch is attached. It currently supports the following functionalities:
Hi,
When I try to run regression test, there has some errors.
make[2]: Entering directory '/home/px/Codes/postgresql/Debug/src/test/subscription'
rm -rf '/home/px/Codes/postgresql/Debug/src/test/subscription'/tmp_check && /usr/bin/mkdir -p '/home/px/Codes/postgresql/Debug/src/test/subscription'/tmp_check && cd /home/px/Codes/postgresql/Debug/../src
/test/subscription && TESTDIR='/home/px/Codes/postgresql/Debug/src/test/subscription' PATH="/home/px/Codes/postgresql/Debug/tmp_install/home/px/Codes/postgresql/Debug/pg/bin:/home/px/Codes/postgresql/Debu
g/src/test/subscription:$PATH" LD_LIBRARY_PATH="/home/px/Codes/postgresql/Debug/tmp_install/home/px/Codes/postgresql/Debug/pg/lib:$LD_LIBRARY_PATH" PGPORT='65432' PG_REGRESS='/home/px/Codes/postgresql/De
bug/src/test/subscription/../../../src/test/regress/pg_regress' /usr/bin/prove -I /home/px/Codes/postgresql/Debug/../src/test/perl/ -I /home/px/Codes/postgresql/Debug/../src/test/subscription t/*.pl
t/001_rep_changes.pl ............... ok
t/002_types.pl ..................... ok
t/003_constraints.pl ............... ok
t/004_sync.pl ...................... 6/? # Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 29 just after 7.
t/004_sync.pl ...................... Dubious, test returned 29 (wstat 7424, 0x1d00)
All 7 subtests passed
t/005_encoding.pl .................. ok
t/006_rewrite.pl ................... 1/? # poll_query_until timed out executing this query:
# SELECT '0/14A8648' <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = 'mysub';
# expecting this output:
# t
# last actual query output:
#
# with stderr:
# Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 29 just after 1.
t/006_rewrite.pl ................... Dubious, test returned 29 (wstat 7424, 0x1d00)
All 1 subtests passed
t/007_ddl.pl ....................... ok
t/008_diff_schema.pl ............... 1/? # Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 29 just after 4.
t/008_diff_schema.pl ............... Dubious, test returned 29 (wstat 7424, 0x1d00)
All 4 subtests passed
t/009_matviews.pl .................. Dubious, test returned 29 (wstat 7424, 0x1d00)
No subtests run
The new patch change the behavior of publication, when we drop a table
on publisher, the table also be dropped on subscriber, and this made the
regression testa failed, since the regression test will try to drop the
table on subscriber.
IMO, we can disable the DDLs replication by default, which makes the
regression test happy. Any thoughts?
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
Hello Alvaro,
I think this is a pretty interesting and useful feature.
Did you see some old code I wrote towards this goal?
/messages/by-id/20150215044814.GL3391@alvh.no-ip.org
The intention was that DDL would produce some JSON blob that accurately
describes the DDL that was run; the caller can acquire that and use it
to produce working DDL that doesn't depend on runtime conditions. There
was lots of discussion on doing things this way. It was ultimately
abandoned, but I think it's valuable.
Thanks for pointing this out. I'm looking into it.
Hello Japin,
The new patch change the behavior of publication, when we drop a table
on publisher, the table also be dropped on subscriber, and this made the
regression testa failed, since the regression test will try to drop the
table on subscriber.
IMO, we can disable the DDLs replication by default, which makes the
regression test happy. Any thoughts?
Good catch, I forgot to run the whole TAP test suite. I think for now
we can simply disable DDL replication for the failing tests when
creating publication: $node_publisher->safe_psql('postgres',
"CREATE PUBLICATION tap_pub FOR ALL TABLES WITH (ddl='')");
I'll fix it in the next version of patch. I'll also work on breaking
down the patch into smaller pieces for ease of review.
Regards,
Zheng
Hello,
Attached please find the broken down patch set. Also fixed the failing
TAP tests Japin reported.
Regards,
Zheng Li
Amazon RDS/Aurora for PostgreSQL
Attachments:
0001-syntax-pg_publication-pg_dump-ddl_replication.patchapplication/octet-stream; name=0001-syntax-pg_publication-pg_dump-ddl_replication.patchDownload+360-148
0002-logical_decoding_ddl_message.patchapplication/octet-stream; name=0002-logical_decoding_ddl_message.patchDownload+1050-13
0003-pgoutput-worker-ddl_replication.patchapplication/octet-stream; name=0003-pgoutput-worker-ddl_replication.patchDownload+645-14
On Fri, 18 Mar 2022 at 00:22, Zheng Li <zhengli10@gmail.com> wrote:
Hello Japin,
The new patch change the behavior of publication, when we drop a table
on publisher, the table also be dropped on subscriber, and this made the
regression testa failed, since the regression test will try to drop the
table on subscriber.IMO, we can disable the DDLs replication by default, which makes the
regression test happy. Any thoughts?Good catch, I forgot to run the whole TAP test suite. I think for now
we can simply disable DDL replication for the failing tests when
creating publication: $node_publisher->safe_psql('postgres',
"CREATE PUBLICATION tap_pub FOR ALL TABLES WITH (ddl='')");
I'll fix it in the next version of patch. I'll also work on breaking
down the patch into smaller pieces for ease of review.
Oh, it doesn't work.
postgres=# CREATE PUBLICATION s1 FOR ALL TABLES WITH(ddl = '');;
ERROR: conflicting or redundant options
Here is the code, I think, you might mean `if (ddl_level_given == NULL)`.
+ if (*ddl_level_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+
+ /*
+ * If publish option was given only the explicitly listed actions
+ * should be published.
+ */
+ pubactions->pubddl_database = false;
+ pubactions->pubddl_table = false;
+
+ *ddl_level_given = true;
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
On Fri, 18 Mar 2022 at 08:18, Zheng Li <zhengli10@gmail.com> wrote:
Hello,
Attached please find the broken down patch set. Also fixed the failing
TAP tests Japin reported.
Thanks for updating the patchset, I will try it later.
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
On Fri, 18 Mar 2022 at 08:22, Japin Li <japinli@hotmail.com> wrote:
On Fri, 18 Mar 2022 at 00:22, Zheng Li <zhengli10@gmail.com> wrote:
Hello Japin,
The new patch change the behavior of publication, when we drop a table
on publisher, the table also be dropped on subscriber, and this made the
regression testa failed, since the regression test will try to drop the
table on subscriber.IMO, we can disable the DDLs replication by default, which makes the
regression test happy. Any thoughts?Good catch, I forgot to run the whole TAP test suite. I think for now
we can simply disable DDL replication for the failing tests when
creating publication: $node_publisher->safe_psql('postgres',
"CREATE PUBLICATION tap_pub FOR ALL TABLES WITH (ddl='')");
I'll fix it in the next version of patch. I'll also work on breaking
down the patch into smaller pieces for ease of review.Oh, it doesn't work.
postgres=# CREATE PUBLICATION s1 FOR ALL TABLES WITH(ddl = '');;
ERROR: conflicting or redundant optionsHere is the code, I think, you might mean `if (ddl_level_given == NULL)`.
+ if (*ddl_level_given) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("conflicting or redundant options"))); + + /* + * If publish option was given only the explicitly listed actions + * should be published. + */ + pubactions->pubddl_database = false; + pubactions->pubddl_table = false; + + *ddl_level_given = true;
Oh, Sorry, I misunderstand it, it just like you forget initialize
*ddl_level_given to false when enter parse_publication_options().
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
On Fri, 18 Mar 2022 at 08:18, Zheng Li <zhengli10@gmail.com> wrote:
Hello,
Attached please find the broken down patch set. Also fixed the failing
TAP tests Japin reported.
Here are some comments for the new patches:
Seems like you forget initializing the *ddl_level_given after entering the
parse_publication_options(), see [1]/messages/by-id/MEYP282MB1669DDF788C623B7F8B64C2EB6139@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM.
+ if (*ddl_level_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
We can use the errorConflictingDefElem() to replace the ereport() to make the
error message more readable.
I also think that ddl = '' isn't a good way to disable DDL replication, how
about using ddl = 'none' or something else?
The test_decoding test case cannot work as expected, see below:
diff -U3 /home/px/Codes/postgresql/contrib/test_decoding/expected/ddlmessages.out /home/px/Codes/postgresql/Debug/contrib/test_decoding/results/ddlmessages.out
--- /home/px/Codes/postgresql/contrib/test_decoding/expected/ddlmessages.out 2022-03-18 08:46:57.653922671 +0800
+++ /home/px/Codes/postgresql/Debug/contrib/test_decoding/results/ddlmessages.out 2022-03-18 17:34:33.411563601 +0800
@@ -25,8 +25,8 @@
SELECT pg_drop_replication_slot('regression_slot');
DROP TABLE tab1;
DROP publication mypub;
- data
----------------------------------------------------------------------------------------------------------------------------------------------------
+ data
+-----------------------------------------------------------------------------------------------------------------------------------------------
DDL message: transactional: 1 prefix: role: redacted, search_path: "$user", public, sz: 47 content:CREATE TABLE tab1 (id serial unique, data int);
BEGIN
sequence public.tab1_id_seq: transactional:1 last_value: 1 log_cnt: 0 is_called:0
Since the DDL message contains the username, and we try to replace the username with 'redacted' to avoid this problem,
\o | sed 's/role.*search_path/role: redacted, search_path/g'
However, the title and dash lines may have different lengths for different
usernames. To avoid this problem, how about using a specified username when
initializing the database for this regression test?
I try to disable the ddlmessage regression test,
make[2]: Entering directory '/home/px/Codes/postgresql/Debug/src/bin/pg_dump'
rm -rf '/home/px/Codes/postgresql/Debug/src/bin/pg_dump'/tmp_check && /usr/bin/mkdir -p '/home/px/Codes/postgresql/Debug/src/bin/pg_dump'/tmp_check && cd /home/px/Codes/postgresql/Debug/../src/bin/pg_dump
&& TESTDIR='/home/px/Codes/postgresql/Debug/src/bin/pg_dump' PATH="/home/px/Codes/postgresql/Debug/tmp_install/home/px/Codes/postgresql/Debug/pg/bin:/home/px/Codes/postgresql/Debug/src/bin/pg_dump:$PATH"
LD_LIBRARY_PATH="/home/px/Codes/postgresql/Debug/tmp_install/home/px/Codes/postgresql/Debug/pg/lib:$LD_LIBRARY_PATH" PGPORT='65432' PG_REGRESS='/home/px/Codes/postgresql/Debug/src/bin/pg_dump/../../../s
rc/test/regress/pg_regress' /usr/bin/prove -I /home/px/Codes/postgresql/Debug/../src/test/perl/ -I /home/px/Codes/postgresql/Debug/../src/bin/pg_dump t/*.pl
t/001_basic.pl ................ ok
t/002_pg_dump.pl .............. 13/?
# Failed test 'binary_upgrade: should dump CREATE PUBLICATION pub1'
# at t/002_pg_dump.pl line 3916.
# Review binary_upgrade results in /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI
# Failed test 'binary_upgrade: should dump CREATE PUBLICATION pub2'
# at t/002_pg_dump.pl line 3916.
# Review binary_upgrade results in /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI
# Failed test 'binary_upgrade: should dump CREATE PUBLICATION pub3'
# at t/002_pg_dump.pl line 3916.
# Review binary_upgrade results in /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI
# Failed test 'binary_upgrade: should dump CREATE PUBLICATION pub4'
# at t/002_pg_dump.pl line 3916.
# Review binary_upgrade results in /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI
t/002_pg_dump.pl .............. 240/?
[...]
# Review section_post_data results in /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI
t/002_pg_dump.pl .............. 7258/? # Looks like you failed 84 tests of 7709.
t/002_pg_dump.pl .............. Dubious, test returned 84 (wstat 21504, 0x5400)
Failed 84/7709 subtests
t/003_pg_dump_with_server.pl .. ok
t/010_dump_connstr.pl ......... ok
Test Summary Report
-------------------
t/002_pg_dump.pl (Wstat: 21504 Tests: 7709 Failed: 84)
Failed tests: 136-139, 362-365, 588-591, 1040-1043, 1492-1495
1719-1722, 1946-1949, 2177-2180, 2407-2410
2633-2636, 2859-2862, 3085-3088, 3537-3540
3763-3766, 3989-3992, 4215-4218, 4441-4444
5119-5122, 5345-5348, 6702-6705, 7154-7157
Non-zero exit status: 84
Files=4, Tests=7808, 26 wallclock secs ( 0.46 usr 0.05 sys + 7.98 cusr 2.80 csys = 11.29 CPU)
Result: FAIL
make[2]: *** [Makefile:55: check] Error 1
make[2]: Leaving directory '/home/px/Codes/postgresql/Debug/src/bin/pg_dump'
make[1]/messages/by-id/MEYP282MB1669DDF788C623B7F8B64C2EB6139@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM: *** [Makefile:43: check-pg_dump-recurse] Error 2
make[1]/messages/by-id/MEYP282MB1669DDF788C623B7F8B64C2EB6139@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM: Leaving directory '/home/px/Codes/postgresql/Debug/src/bin'
make: *** [GNUmakefile:71: check-world-src/bin-recurse] Error 2
And, when I try to use git am to apply the patch, it complains:
$ git am ~/0001-syntax-pg_publication-pg_dump-ddl_replication.patch
Patch format detection failed.
[1]: /messages/by-id/MEYP282MB1669DDF788C623B7F8B64C2EB6139@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
Hello,
Thanks for the quick review!
And, when I try to use git am to apply the patch, it complains:
$ git am ~/0001-syntax-pg_publication-pg_dump-ddl_replication.patch
Patch format detection failed.
git apply works for me. I'm not sure why git am complains.
I also created a git branch to make code sharing easier, please try this out:
https://github.com/zli236/postgres/tree/ddl_replication
Seems like you forget initializing the *ddl_level_given after entering the
parse_publication_options(), see [1].+ if (*ddl_level_given) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("conflicting or redundant options")));We can use the errorConflictingDefElem() to replace the ereport() to make the
error message more readable.
Agreed. Fixed in the latest version.
I also think that ddl = '' isn't a good way to disable DDL replication, how
about using ddl = 'none' or something else?
The syntax follows the existing convention of the WITH publish option.
For example in CREATE PUBLICATION mypub WITH (publish = '')
it means don't publish any DML change. So I'd leave it as is for now.
The test_decoding test case cannot work as expected, see below:
.....
DDL message: transactional: 1 prefix: role: redacted, search_path: "$user", public, sz: 47 content:CREATE TABLE tab1 (id serial unique, data int);
BEGIN
sequence public.tab1_id_seq: transactional:1 last_value: 1 log_cnt: 0 is_called:0Since the DDL message contains the username, and we try to replace the username with 'redacted' to avoid this problem,
\o | sed 's/role.*search_path/role: redacted, search_path/g'
However, the title and dash lines may have different lengths for different
usernames. To avoid this problem, how about using a specified username when
initializing the database for this regression test?
I don't understand the question, do you have an example of when the
test doesn't work? It runs fine for me.
t/002_pg_dump.pl .............. 13/?
# Failed test 'binary_upgrade: should dump CREATE PUBLICATION pub1'
# at t/002_pg_dump.pl line 3916.
# Review binary_upgrade results in /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI
......
Failed 84/7709 subtests
t/003_pg_dump_with_server.pl .. ok
t/010_dump_connstr.pl ......... okTest Summary Report
-------------------
t/002_pg_dump.pl (Wstat: 21504 Tests: 7709 Failed: 84)
This is fixed in the latest version. I need to remind myself to run
make check-world in the future.
Regards,
Zheng Li
git branch of the change:
https://github.com/zli236/postgres/tree/ddl_replication
On Sat, 19 Mar 2022 at 01:25, Zheng Li <zhengli10@gmail.com> wrote:
Hello,
Thanks for the quick review!
And, when I try to use git am to apply the patch, it complains:
$ git am ~/0001-syntax-pg_publication-pg_dump-ddl_replication.patch
Patch format detection failed.git apply works for me. I'm not sure why git am complains.
I also created a git branch to make code sharing easier, please try this out:
https://github.com/zli236/postgres/tree/ddl_replication
Yeah, I can use git apply, I'm not sure how did you generate the patch.
I also think that ddl = '' isn't a good way to disable DDL replication, how
about using ddl = 'none' or something else?The syntax follows the existing convention of the WITH publish option.
For example in CREATE PUBLICATION mypub WITH (publish = '')
it means don't publish any DML change. So I'd leave it as is for now.
Agreed.
The test_decoding test case cannot work as expected, see below:
.....
DDL message: transactional: 1 prefix: role: redacted, search_path: "$user", public, sz: 47 content:CREATE TABLE tab1 (id serial unique, data int);
BEGIN
sequence public.tab1_id_seq: transactional:1 last_value: 1 log_cnt: 0 is_called:0Since the DDL message contains the username, and we try to replace the username with 'redacted' to avoid this problem,
\o | sed 's/role.*search_path/role: redacted, search_path/g'
However, the title and dash lines may have different lengths for different
usernames. To avoid this problem, how about using a specified username when
initializing the database for this regression test?I don't understand the question, do you have an example of when the
test doesn't work? It runs fine for me.
You should use a different user that has different length from your current one.
For example:
px@localhost$ make check-world
t/002_pg_dump.pl .............. 13/?
# Failed test 'binary_upgrade: should dump CREATE PUBLICATION pub1'
# at t/002_pg_dump.pl line 3916.
# Review binary_upgrade results in /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI......
Failed 84/7709 subtests
t/003_pg_dump_with_server.pl .. ok
t/010_dump_connstr.pl ......... okTest Summary Report
-------------------
t/002_pg_dump.pl (Wstat: 21504 Tests: 7709 Failed: 84)This is fixed in the latest version. I need to remind myself to run
make check-world in the future.
Thanks for updating the patch.
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.