[PATCH] Logical decoding of TRUNCATE
Hi,
This patch implements support for TRUNCATE statements
in logical replication. The work has mainly done by Simon Riggs then
finished by me. Tests are written by me.
TRUNCATE is treated as a form of DELETE for the purpose of deciding
whether to publish, or not.
The "TRUNCATE behavior when session_replication_role = replica"[1]https://commitfest.postgresql.org/16/1447/ patch
is required from this patch to work correctly with tables referenced by
foreign keys.
[1]: https://commitfest.postgresql.org/16/1447/
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
Attachments:
logical_decode_truncate.v10.patchtext/plain; charset=UTF-8; name=logical_decode_truncate.v10.patch; x-mac-creator=0; x-mac-type=0Download+498-48
Hi,
On 2017-12-29 14:15:22 +0100, Marco Nenciarini wrote:
This patch implements support for TRUNCATE statements
in logical replication. The work has mainly done by Simon Riggs then
finished by me. Tests are written by me.TRUNCATE is treated as a form of DELETE for the purpose of deciding
whether to publish, or not.
It'd be good if you explained exactly what the chosen behaviour is, and
why that's the right behaviour in comparison to other alternatives.
- Andres
Hi,
Il 29/12/17 20:55, Andres Freund ha scritto:
Hi,
On 2017-12-29 14:15:22 +0100, Marco Nenciarini wrote:
This patch implements support for TRUNCATE statements
in logical replication. The work has mainly done by Simon Riggs then
finished by me. Tests are written by me.TRUNCATE is treated as a form of DELETE for the purpose of deciding
whether to publish, or not.It'd be good if you explained exactly what the chosen behaviour is, and
why that's the right behaviour in comparison to other alternatives.
here is the description of how the patch works:
* A new WAL record type has been added (XLOG_HEAP_TRUNCATE) to allow a
precise logical decoding of the TRUNCATE operation. It contains the
TRUNCATE flags and the list of the involved tables and sequences. It is
treated as a NO-OP during the WAL reply as all the physical operations
are logged in SMGR WAL records.
* A new TRUNCATE message has been added to the logical protocol. It
carries the information about the truncation of one single table
specifying if CASCADE or RESTART IDENTITY has been specified.
* The ExecuteTruncateGuts function has been extracted from
ExecuteTruncate. This function is used by the logical apply process to
replicate the TRUNCATE operations, honouring the eventual CASCADE and
RESTART IDENTITY flag.
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
On 29 December 2017 at 19:55, Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2017-12-29 14:15:22 +0100, Marco Nenciarini wrote:
This patch implements support for TRUNCATE statements
in logical replication. The work has mainly done by Simon Riggs then
finished by me. Tests are written by me.TRUNCATE is treated as a form of DELETE for the purpose of deciding
whether to publish, or not.It'd be good if you explained exactly what the chosen behaviour is, and
why that's the right behaviour in comparison to other alternatives.
At present the patch treats TRUNCATE as if it were a DELETE
which means that
CREATE PUBLICATION insert_only FOR TABLE mydata WITH (publish = 'insert');
will not publish truncates before or after this patch
CREATE PUBLICATION insert_only FOR TABLE mydata;
will now publish TRUNCATEs, although they were ignored in PG10
so PG10 publications will act differently
I had regarded it as a missing piece, but some may view that is a
behaviour change in PG11
Alternatively, we could also use WITH (publish = 'truncate') as a
separate decision.
That is an easy change if we wish it.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
I've found some SGML errors in the version v10 of the patch. I've fixed
it in version v11 that is attached.
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
Attachments:
logical_decode_truncate.v11.patchtext/plain; charset=UTF-8; name=logical_decode_truncate.v11.patch; x-mac-creator=0; x-mac-type=0Download+503-48
Patch rebased on the current master.
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
Attachments:
logical_decode_truncate.v12.patchtext/plain; charset=UTF-8; name=logical_decode_truncate.v12.patch; x-mac-creator=0; x-mac-type=0Download+503-48
Attached here there is the complete list of patches required to pass all
the tests. The 0001 patch is discussed in a separate thread, but I've
posted it also here to ease the review of the 0002.
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
Attachments:
0001-truncate_ignore_fks_role_replica.v2.patchtext/plain; charset=UTF-8; name=0001-truncate_ignore_fks_role_replica.v2.patch; x-mac-creator=0; x-mac-type=0Download+21-12
0002-logical_decode_truncate.v12.patchtext/plain; charset=UTF-8; name=0002-logical_decode_truncate.v12.patch; x-mac-creator=0; x-mac-type=0Download+503-48
Hi all,
After commit bbd3363e128daec0e70952c1bb2f12ab1f6f1292 that refactor
subscription tests to use PostgresNode's wait_for_catchup, the patch
needs to be updated to use wait_for_catchup.
Attached there is the updated patch. is discussed in a separate thread
https://commitfest.postgresql.org/16/1447/, but I've posted it also here
to ease the review of the 0002.
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
Attachments:
0001-truncate_ignore_fks_role_replica.v2.patchtext/plain; charset=UTF-8; name=0001-truncate_ignore_fks_role_replica.v2.patch; x-mac-creator=0; x-mac-type=0Download+21-12
0002-logical_decode_truncate.v13.patchtext/plain; charset=UTF-8; name=0002-logical_decode_truncate.v13.patch; x-mac-creator=0; x-mac-type=0Download+499-48
Hi,
I reviewed 0001 in its own thread.
So I think that we generally want this patch and I think the design
decisions are right. Namely:
TRUNCATE being treated as DELETE in terms of DML filtering makes sense
to me as it is basically bulk delete, needs to be mentioned in release
notes though.
Adding special message to protocol is appropriate as truncate is more
DML than DDL in sense of manipulating data so it should be replicated
separately from other DDL
Processing relations that were truncated when CASCADE is used separately
is needed because we allow relations to be filtered by logical replication
I see the patch adds new xlog record which is perhaps not ideal but the
current one seems utterly unsuitable for decoding so I guess it's okay,
especially when it's only added for wal_level = logical which it is.
Also TRUNCATE is not exactly high tps operation.
Things I am less convinced about:
The patch will cascade truncation on downstream if cascade was specified
on the upstream, that can potentially be dangerous and we either should
not do it and only truncate the tables which were truncated upstream
(but without restricting because of FKs), leaving the data inconsistent
on downstream (like we do already with DELETE or UPDATE). Or maybe make
it into either subscription or publication option so that user can chose
the behaviour here as I am sure some people will want it to cascade (but
the default should still IMHO be to not cascade as that's safer).
+ /* logicalrep_rel_close call not needed, because ExecuteTruncateGuts + * already closes the relations. Setting localrel to NULL in the map entry + * is still needed. + */ + rel->localrel = NULL;
This is somewhat ugly. Perhaps the ExecuteTruncateGuts should track
which relations it opened and only close those and the rest should be
closed by caller? That should also remove the other ugly part which is
that the ExecuteTruncateGuts modifies the input list. What if caller
wanted to use those relations it sent as parameter later?
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 17 January 2018 at 17:07, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
Things I am less convinced about:
The patch will cascade truncation on downstream if cascade was specified
on the upstream, that can potentially be dangerous and we either should
not do it and only truncate the tables which were truncated upstream
(but without restricting because of FKs), leaving the data inconsistent
on downstream (like we do already with DELETE or UPDATE). Or maybe make
it into either subscription or publication option so that user can chose
the behaviour here as I am sure some people will want it to cascade (but
the default should still IMHO be to not cascade as that's safer).
I agree the default should be to NOT cascade.
If someone wants cascading as a publication option, that can be added later.
+ /* logicalrep_rel_close call not needed, because ExecuteTruncateGuts + * already closes the relations. Setting localrel to NULL in the map entry + * is still needed. + */ + rel->localrel = NULL;This is somewhat ugly. Perhaps the ExecuteTruncateGuts should track
which relations it opened and only close those and the rest should be
closed by caller? That should also remove the other ugly part which is
that the ExecuteTruncateGuts modifies the input list. What if caller
wanted to use those relations it sent as parameter later?
Agreed
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi All,
Il 18/01/18 17:48, Simon Riggs ha scritto:
On 17 January 2018 at 17:07, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
Things I am less convinced about:
The patch will cascade truncation on downstream if cascade was specified
on the upstream, that can potentially be dangerous and we either should
not do it and only truncate the tables which were truncated upstream
(but without restricting because of FKs), leaving the data inconsistent
on downstream (like we do already with DELETE or UPDATE). Or maybe make
it into either subscription or publication option so that user can chose
the behaviour here as I am sure some people will want it to cascade (but
the default should still IMHO be to not cascade as that's safer).I agree the default should be to NOT cascade.
If someone wants cascading as a publication option, that can be added later.
I agree that not replicating the CASCADE option is the best option
according to POLA principle.
+ /* logicalrep_rel_close call not needed, because ExecuteTruncateGuts + * already closes the relations. Setting localrel to NULL in the map entry + * is still needed. + */ + rel->localrel = NULL;This is somewhat ugly. Perhaps the ExecuteTruncateGuts should track
which relations it opened and only close those and the rest should be
closed by caller? That should also remove the other ugly part which is
that the ExecuteTruncateGuts modifies the input list. What if caller
wanted to use those relations it sent as parameter later?Agreed
Attached a new version of the patch addressing these issues.
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
Attachments:
0001-truncate_ignore_fks_role_replica.v3.patchtext/plain; charset=UTF-8; name=0001-truncate_ignore_fks_role_replica.v3.patch; x-mac-creator=0; x-mac-type=0Download+17-8
0002-logical_decode_truncate.v14.patchtext/plain; charset=UTF-8; name=0002-logical_decode_truncate.v14.patch; x-mac-creator=0; x-mac-type=0Download+511-57
On 19/01/18 12:37, Marco Nenciarini wrote:
Hi All,
Il 18/01/18 17:48, Simon Riggs ha scritto:
On 17 January 2018 at 17:07, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
Things I am less convinced about:
The patch will cascade truncation on downstream if cascade was specified
on the upstream, that can potentially be dangerous and we either should
not do it and only truncate the tables which were truncated upstream
(but without restricting because of FKs), leaving the data inconsistent
on downstream (like we do already with DELETE or UPDATE). Or maybe make
it into either subscription or publication option so that user can chose
the behaviour here as I am sure some people will want it to cascade (but
the default should still IMHO be to not cascade as that's safer).I agree the default should be to NOT cascade.
If someone wants cascading as a publication option, that can be added later.
I agree that not replicating the CASCADE option is the best option
according to POLA principle.+ /* logicalrep_rel_close call not needed, because ExecuteTruncateGuts + * already closes the relations. Setting localrel to NULL in the map entry + * is still needed. + */ + rel->localrel = NULL;This is somewhat ugly. Perhaps the ExecuteTruncateGuts should track
which relations it opened and only close those and the rest should be
closed by caller? That should also remove the other ugly part which is
that the ExecuteTruncateGuts modifies the input list. What if caller
wanted to use those relations it sent as parameter later?Agreed
Attached a new version of the patch addressing these issues.
Besides the small thing I wrote for the 0001 in the other thread I am
pretty much happy with this now.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 22/01/18 19:45, Petr Jelinek wrote:
On 19/01/18 12:37, Marco Nenciarini wrote:
Hi All,
+ /* logicalrep_rel_close call not needed, because ExecuteTruncateGuts + * already closes the relations. Setting localrel to NULL in the map entry + * is still needed. + */ + rel->localrel = NULL;This is somewhat ugly. Perhaps the ExecuteTruncateGuts should track
which relations it opened and only close those and the rest should be
closed by caller? That should also remove the other ugly part which is
that the ExecuteTruncateGuts modifies the input list. What if caller
wanted to use those relations it sent as parameter later?Agreed
Attached a new version of the patch addressing these issues.
Besides the small thing I wrote for the 0001 in the other thread I am
pretty much happy with this now.
Actually on second look, I don't like the new boolean parameter much.
I'd rather we didn't touch the input list and always close only
relations opened inside the ExecuteTruncateGuts().
It may mean more list(s) but the current interface is still not clean.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Il 22/01/18 23:18, Petr Jelinek ha scritto:
On 22/01/18 19:45, Petr Jelinek wrote:
Actually on second look, I don't like the new boolean parameter much.
I'd rather we didn't touch the input list and always close only
relations opened inside the ExecuteTruncateGuts().It may mean more list(s) but the current interface is still not clean.
Now ExecuteTruncateGuts unconditionally closes the relations that it
opens. The caller has now always the responsibility to close the
relations passed with the explicit_rels list.
Version 15 attached.
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
Attachments:
0001-truncate_ignore_fks_role_replica.v4.patchtext/plain; charset=UTF-8; name=0001-truncate_ignore_fks_role_replica.v4.patch; x-mac-creator=0; x-mac-type=0Download+26-8
0002-logical_decode_truncate.v15.patchtext/plain; charset=UTF-8; name=0002-logical_decode_truncate.v15.patch; x-mac-creator=0; x-mac-type=0Download+517-53
Hi,
On 23/01/18 15:38, Marco Nenciarini wrote:
Il 22/01/18 23:18, Petr Jelinek ha scritto:
On 22/01/18 19:45, Petr Jelinek wrote:
Actually on second look, I don't like the new boolean parameter much.
I'd rather we didn't touch the input list and always close only
relations opened inside the ExecuteTruncateGuts().It may mean more list(s) but the current interface is still not clean.
Now ExecuteTruncateGuts unconditionally closes the relations that it
opens. The caller has now always the responsibility to close the
relations passed with the explicit_rels list.
This looks good.
Version 15 attached.
I see you still do CASCADE on the subscriber though.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Il 23/01/18 18:13, Petr Jelinek ha scritto:
Hi,
On 23/01/18 15:38, Marco Nenciarini wrote:
Il 22/01/18 23:18, Petr Jelinek ha scritto:
On 22/01/18 19:45, Petr Jelinek wrote:
Actually on second look, I don't like the new boolean parameter much.
I'd rather we didn't touch the input list and always close only
relations opened inside the ExecuteTruncateGuts().It may mean more list(s) but the current interface is still not clean.
Now ExecuteTruncateGuts unconditionally closes the relations that it
opens. The caller has now always the responsibility to close the
relations passed with the explicit_rels list.This looks good.
Version 15 attached.
I see you still do CASCADE on the subscriber though.
No it doesn't. The following code in worker.c prevents that.
+ /*
+ * Even if we used CASCADE on the upstream master we explicitly
+ * default to replaying changes without further cascading.
+ * This might be later changeable with a user specified option.
+ */
+ cascade = false;
There is also a test that check it works as intended:
+ # should not cascade on replica
+ $node_publisher->safe_psql('postgres', "TRUNCATE tab_rep CASCADE");
+
+ $node_publisher->wait_for_catchup($appname);
+
+ $result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM tab_notrep_fk");
+ is($result, qq(1|-1|-1),
+ 'check replicated truncate does not cascade on replica');
+
+ $result = $node_subscriber->safe_psql('postgres',
+ "SELECT nextval('seq_notrep')");
+ is($result, qq(102),
+ 'check replicated truncate does not restart identities');
+
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
On 23/01/18 18:19, Marco Nenciarini wrote:
Il 23/01/18 18:13, Petr Jelinek ha scritto:
Hi,
On 23/01/18 15:38, Marco Nenciarini wrote:
Il 22/01/18 23:18, Petr Jelinek ha scritto:
On 22/01/18 19:45, Petr Jelinek wrote:
Actually on second look, I don't like the new boolean parameter much.
I'd rather we didn't touch the input list and always close only
relations opened inside the ExecuteTruncateGuts().It may mean more list(s) but the current interface is still not clean.
Now ExecuteTruncateGuts unconditionally closes the relations that it
opens. The caller has now always the responsibility to close the
relations passed with the explicit_rels list.This looks good.
Version 15 attached.
I see you still do CASCADE on the subscriber though.
No it doesn't. The following code in worker.c prevents that.
+ /* + * Even if we used CASCADE on the upstream master we explicitly + * default to replaying changes without further cascading. + * This might be later changeable with a user specified option. + */ + cascade = false;
Ah, that's pretty ugly, why don't we just use DROP_RESTRICT always
instead of this (keeping the comment). Unless you plan to make it option
as part of this patch, the current coding is confusing.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Il 23/01/18 18:25, Petr Jelinek ha scritto:
On 23/01/18 18:19, Marco Nenciarini wrote:
Il 23/01/18 18:13, Petr Jelinek ha scritto:
Hi,
On 23/01/18 15:38, Marco Nenciarini wrote:
Il 22/01/18 23:18, Petr Jelinek ha scritto:
On 22/01/18 19:45, Petr Jelinek wrote:
Actually on second look, I don't like the new boolean parameter much.
I'd rather we didn't touch the input list and always close only
relations opened inside the ExecuteTruncateGuts().It may mean more list(s) but the current interface is still not clean.
Now ExecuteTruncateGuts unconditionally closes the relations that it
opens. The caller has now always the responsibility to close the
relations passed with the explicit_rels list.This looks good.
Version 15 attached.
I see you still do CASCADE on the subscriber though.
No it doesn't. The following code in worker.c prevents that.
+ /* + * Even if we used CASCADE on the upstream master we explicitly + * default to replaying changes without further cascading. + * This might be later changeable with a user specified option. + */ + cascade = false;Ah, that's pretty ugly, why don't we just use DROP_RESTRICT always
instead of this (keeping the comment). Unless you plan to make it option
as part of this patch, the current coding is confusing.
Ok, Removed.
Version 16 attached.
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
Attachments:
0001-truncate_ignore_fks_role_replica.v4.patchtext/plain; charset=UTF-8; name=0001-truncate_ignore_fks_role_replica.v4.patch; x-mac-creator=0; x-mac-type=0Download+26-8
0002-logical_decode_truncate.v16.patchtext/plain; charset=UTF-8; name=0002-logical_decode_truncate.v16.patch; x-mac-creator=0; x-mac-type=0Download+512-53
Hi,
On 23/01/18 18:47, Marco Nenciarini wrote:
Il 23/01/18 18:25, Petr Jelinek ha scritto:
On 23/01/18 18:19, Marco Nenciarini wrote:
Il 23/01/18 18:13, Petr Jelinek ha scritto:
Hi,
On 23/01/18 15:38, Marco Nenciarini wrote:
Il 22/01/18 23:18, Petr Jelinek ha scritto:
On 22/01/18 19:45, Petr Jelinek wrote:
Actually on second look, I don't like the new boolean parameter much.
I'd rather we didn't touch the input list and always close only
relations opened inside the ExecuteTruncateGuts().It may mean more list(s) but the current interface is still not clean.
Now ExecuteTruncateGuts unconditionally closes the relations that it
opens. The caller has now always the responsibility to close the
relations passed with the explicit_rels list.This looks good.
Version 15 attached.
I see you still do CASCADE on the subscriber though.
No it doesn't. The following code in worker.c prevents that.
+ /* + * Even if we used CASCADE on the upstream master we explicitly + * default to replaying changes without further cascading. + * This might be later changeable with a user specified option. + */ + cascade = false;Ah, that's pretty ugly, why don't we just use DROP_RESTRICT always
instead of this (keeping the comment). Unless you plan to make it option
as part of this patch, the current coding is confusing.Ok, Removed.
Great, thanks, I think this is ready now so marking as such.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
I wonder whether this should be dealing with sequences at all. We are
not currently publishing any information about sequences, so it seems
weird to do it only here. Also, I'd imagine that if we ever get to
publishing sequence events, then the sequence restarts would be
published as independent events.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services